HUMAN GENOME-DERIVED SINGLE EXON NUCLEIC ACID PROBES USEFUL
FOR ANALYSIS OF GENE EXPRESSION IN HUMAN HEART



CROSS REFERENCE TO RELATED APPLICATIONS 

The present application is a continuation-in-part of U.S.
patent application serial nos. 09/632,366, filed August 3,
2000 and 09/608,408, filed June 30, 2000; claims the
benefit under 35 U.S.C. § 119(e) of U.S. provisional patent
application serial nos. 60/236,359, filed September 27,
2000, 60/234,687, filed September 21, 2000, 60/207,456,
filed May 26, 2000, and 60/180,312, filed February 4, 2000;
and further claims the benefit under 35 U.S.C. § 119(a) of
UK patent application no. 0024263.6, filed October 4, 2000,
the disclosures of which are incorporated herein by
reference in their entireties.

REFERENCE TO SEQUENCE LISTING AND INCORPORATION BY 
REFERENCE THEREOF 

The present application includes a Sequence Listing in
electronic format, filed pursuant to PCT Administrative
Instructions 801-806 on a single CD-R disc, in
triplicate, containing a file named pto_HEART.txt, created
24 January 2001, having 20,186,946 bytes. The Sequence
Listing contained in said file on said disc is incorporated
herein by reference in its entirety.

Field of the Invention 

The present invention relates to genome-derived
single exon microarrays useful for verifying the expression
of regions of genomic DNA predicted to encode protein. In
particular, the present invention relates to unique
genome-derived single exon nucleic acid probes expressed in
human heart and single exon nucleic acid microarrays that
include such probes.



Background of the Invention

For almost two decades following the invention of
general techniques for nucleic acid sequencing, Sanger et
al., Proc. Natl. Acad. Sci. USA 70(4):1209-13 (1973);
Gilbert et al., Proc. Natl. Acad. Sci. USA 70(12):3581-4
(1973), these techniques were used principally as tools to
further the understanding of proteins, known or
suspected, about which a basic foundation of biological
knowledge had already been built. In many cases, the
cloning effort that preceded sequence identification had
been both informed and directed by that antecedent
biological understanding.

For example, the cloning of the T cell receptor
for antigen was predicated upon its known or suspected cell
type-specific expression, upon its suspected membrane
association, and upon the predicted assembly of its gene via
T cell-specific somatic recombination. Subsequent
sequencing efforts at once confirmed and extended
understanding of this family of proteins. Hedrick et al.,
Nature 308(5955):153-8 (1984).

More recently, however, the development of high
throughput sequencing methods and devices, in concert with
large public and private undertakings to sequence the human
and other genomes, has altered this investigational
paradigm: today, sequence information often precedes
understanding of the basic biology of the encoded protein
product.

WO 01/57274 PCT/US01/00666

One of the approaches to large-scale sequencing
is predicated upon the proposition that expressed
sequences, that is, those accessible through isolation of
mRNA, are of greatest initial interest. This "expressed
sequence tag" ("EST") approach has already yielded vast
amounts of sequence data (see, for example, Adams et al.,
Science 252:1651 (1991); Williamson, Drug Discov. Today
4:115 (1999)). For nucleic acids sequenced by this
approach, often the only biological information that is
known a priori with any certainty is the likelihood of
biologic expression itself. By virtue of the species and
tissue from which the mRNA had originally been obtained,
most such sequences are also annotated with the identity of
the species and at least one tissue in which expression
appears likely.

More recently, the pace of genomic sequencing has
accelerated dramatically. When genomic DNA serves as the
initial substrate for sequencing efforts, expression cannot
be presumed; often the only a priori biological information
about the sequence includes the species and chromosome (and
perhaps chromosomal map location) of origin.

With the ever-accelerating pace of sequence
accumulation by directed, EST, and genomic sequencing
approaches, and in particular with the accumulation of
sequence information from multiple genera, from multiple
species within genera, and from multiple individuals within
a species, there is an increasing need for methods that
rapidly and effectively permit the functions of nucleic
acid sequences to be elucidated. And as such functional
information accumulates, there is a further need for
methods of storing such functional information in
meaningful and useful relationship to the sequence itself;
that is, there is an increasing need for means and
apparatus for annotating raw sequence data with known or
predicted functional information.

Although the increase in the pace of genomic
sequencing is due in large part to technological changes in
sequencing strategies and instrumentation, Service, Science
280:995 (1998); Pennisi, Science 283:1822-1823 (1999),
there is an important functional motivation as well.

While it was understood that the EST approach
would rarely be able to yield sequence information about
the noncoding portions of the genome, it now also appears
the EST approach is capable of capturing only a fraction of
a genome's actual expression complexity.

For example, when the C. elegans genome was fully
sequenced, gene prediction algorithms identified over
19,000 potential genes, of which only 7,000 had been found
by EST sequencing. C. elegans Sequencing Consortium,
Science 282:2012 (1998). Analogously, the recently
completed sequence of chromosome 2 of Arabidopsis predicts
over 4000 genes, Lin et al., Nature 402:761 (1999), of
which only about 6% had previously been identified via EST
sequencing efforts. Although the human genome has the
greatest depth of EST coverage, it is still woefully short
of surrendering all of its genes. One recent estimate
suggests that the human genome contains more than 146,000
genes, which would at this point leave greater than half of
the genes undiscovered. It is now predicted that many
genes, perhaps 20 to 50%, will only be found by genomic
sequencing.

There is, therefore, a need for methods that
permit the functional regions of genomic sequence (and
most importantly, but not exclusively, regions that
function to encode genes) to be identified.

Much of the coding sequence of the human genome
is not homologous to known genes, making detection of open
reading frames ("ORFs") and predictions of gene function
difficult. Computational methods exist for predicting
coding regions in eukaryotic genomes. Gene prediction
programs such as GRAIL and GRAIL II, Uberbacher et al.,
Proc. Natl. Acad. Sci. USA 88(24):11261-5 (1991); Xu et
al., Genet. Eng. 16:241-53 (1994); Uberbacher et al.,
Methods Enzymol. 266:259-81 (1996); GENEFINDER, Solovyev et
al., Nucl. Acids Res. 22:5156-63 (1994); Solovyev et al.,
Ismb 5:294-302 (1997); and GENESCAN, Burge et al., J. Mol.
Biol. 268:78-94 (1997), predict many putative genes without
known homology or function. Such programs are known,
however, to give high false positive rates. Burset et al.,
Genomics 34:353-367 (1996). Using a consensus obtained by
a plurality of such programs is known to increase the
reliability of calling exons from genomic sequence.
Ansari-Lari et al., Genome Res. 8(1):29-40 (1998).
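
By way of illustration only, the consensus idea can be sketched as
follows. This is not the procedure of the cited reference; it is a
minimal sketch that assumes each program reports its predicted exons
as non-overlapping half-open coordinate intervals, and that a base is
retained when at least a chosen number of programs agree. The function
name, the interval representation, the program labels, and the
min_votes parameter are all hypothetical.

```python
from typing import Dict, List, Tuple

Interval = Tuple[int, int]  # half-open [start, end) on the genomic sequence


def consensus_exons(predictions: Dict[str, List[Interval]],
                    min_votes: int = 2) -> List[Interval]:
    """Return maximal regions predicted by at least min_votes programs.

    Assumes the intervals reported by any single program do not overlap,
    so coverage depth equals the number of agreeing programs.
    """
    # Sweep line: +1 where a predicted exon starts, -1 where it ends.
    events = []
    for intervals in predictions.values():
        for start, end in intervals:
            if end > start:
                events.append((start, 1))
                events.append((end, -1))
    events.sort()  # at equal positions, ends (-1) sort before starts (+1)

    consensus: List[Interval] = []
    depth = 0
    region_start = 0
    for pos, delta in events:
        previous = depth
        depth += delta
        if previous < min_votes <= depth:
            region_start = pos  # coverage just reached the threshold
        elif depth < min_votes <= previous:
            if consensus and consensus[-1][1] == region_start:
                # Abutting regions are merged into one.
                consensus[-1] = (consensus[-1][0], pos)
            elif pos > region_start:
                consensus.append((region_start, pos))
    return consensus


if __name__ == "__main__":
    calls = {
        "program_a": [(100, 200)],
        "program_b": [(150, 250)],
        "program_c": [(400, 500)],
    }
    print(consensus_exons(calls, min_votes=2))
```

Raising min_votes trades sensitivity for specificity: fewer regions
survive, but each is supported by more independent predictions.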

Identification of functional genes from genomic
data remains, however, an imperfect art. For example, in
reporting the full sequence of human chromosome 21, the
Chromosome 21 Mapping and Sequencing Consortium reports
that prior bioinformatic estimates of human gene number may
need to be revised substantially downwards. Nature
405:311-319 (2000); Reeves, Nature 405:283-284 (2000).

Thus, there is a need for methods and apparatus
that permit the functions of the regions identified
bioinformatically, and specifically that permit the
expression of regions predicted to encode protein, readily
to be confirmed experimentally.

Recently, the development of nucleic acid
microarrays has made possible the automated and highly
parallel measurement of gene expression. Reviewed in
Schena (ed.), DNA Microarrays: A Practical Approach
(Practical Approach Series), Oxford University Press (1999)
(ISBN: 0199637768); Nature Genet. 21(1)(suppl):1-60
(1999); Schena (ed.), Microarray Biochip: Tools and
Technology, Eaton Publishing Company/BioTechniques Books
Division (2000) (ISBN: 1881299376).

It is common for microarrays to be derived from
cDNA/EST libraries, either from those previously described
in the literature, such as those from the I.M.A.G.E.
consortium, Lennon et al., Genomics 33(1):151-2 (1996), or
from the construction of "problem specific" libraries
targeted at a particular biological question, R.S. Thomas
et al., Cancer Res. (in press). Such microarrays by
definition can measure expression only of those genes found
in EST libraries, and thus have not been useful as probes
for genes discovered solely by genomic sequencing.

The utility of using whole genome nucleic acid
microarrays to answer certain biological questions has been
demonstrated for the yeast Saccharomyces cerevisiae. DeRisi
et al., Science 278:680 (1997). The vast majority of
yeast nuclear genes, however, approximately 95%, are single
exon genes, i.e., lack introns, Lopez et al., RNA 5:1135-1137
(1999); Goffeau et al., Science 274:563-67 (1996),
permitting coding regions more readily to be identified.
Whole genome nucleic acid microarrays have not generally
been used to probe gene expression from more complex
eukaryotic genomes, and in particular from those averaging
more than one intron per gene.

Diseases of the heart and vascular system are a
significant cause of human morbidity and mortality.
Increasingly, genetic factors are being found that
contribute to predisposition, onset, and/or aggressiveness
of most, if not all, of these diseases. Although mutations
in single genes have on occasion been identified as
causative, these disorders are for the most part believed
to have polygenic etiologies. There is a need for methods
and apparatus that permit prediction, diagnosis and
prognosis of diseases of the human heart, particularly
those diseases with polygenic etiology.

Summary of the Invention 

The present invention solves these and other
problems in the art by providing methods and apparatus for
predicting, confirming, and displaying functional
information derived from genomic sequence. The present
invention also provides apparatus for verifying the
expression of putative genes identified within genomic
sequence.

In particular, the invention provides novel
genome-derived single exon nucleic acid microarrays useful
for verifying the expression of putative genes identified
within genomic sequence.

The present invention also provides compositions
and kits for the ready production of nucleic acids
identical in sequence to, or substantially identical in
sequence to, probes on the genome-derived single exon
microarrays of the present invention.

Accordingly, in a first aspect of the invention,
there is provided a spatially-addressable set of single
exon nucleic acid probes for measuring gene expression in a
sample derived from human heart, comprising a plurality of
single exon nucleic acid probes according to any one of the
nucleotide sequences set out in SEQ ID NOs: 1 - 9,980 or a
complementary sequence, or a portion of such a sequence.

By plurality is meant at least two, suitably at
least 20, most suitably at least 100, preferably at least
1000 and, most preferably, up to 5000.

In one embodiment of the first aspect, each of
said plurality of probes is separately and addressably
amplifiable.

In an alternative embodiment, each of said
plurality of probes is separately and addressably
isolatable from said plurality.

In a preferred embodiment, each of said plurality
of probes is amplifiable using at least one common primer.
Preferably, each of said plurality of probes is amplifiable
using a first and a second common primer.

In yet another embodiment, said set of single
exon nucleic acid probes comprises between 50 and 20,000
probes, for example, 50 - 5000.

Suitably, said set of single exon nucleic acid
probes comprises at least 50 - 1000 discrete single exon
nucleic acid probes having a sequence as set out in any of
SEQ ID NOs: 1 - 19,771 or a complementary sequence, or a
portion of such a sequence.

Preferably, the average length of the single exon
nucleic acid probes is between 200 and 500 bp. It is
preferred that the average length should be at least 200 bp,
suitably at least 250 bp, most suitably at least 300 bp,
preferably at least 400 bp and, most preferably, 500 bp.

In another embodiment, the single exon nucleic
acid probes lack prokaryotic and bacteriophage vector
sequence. It is preferred that at least 50%, suitably at
least 60%, most suitably at least 70%, preferably at least
75%, more preferably at least 80, 85, 90, 95 or 99% of said
single exon nucleic acid probes lack prokaryotic and
bacteriophage vector sequence.

In another preferred embodiment, said single exon
nucleic acid probes lack homopolymeric stretches of A or T.
It is preferred that at least 50%, suitably at least 60%,
most suitably at least 70%, preferably at least 75%, more
preferably at least 80, 85, 90, 95 or 99% of said single
exon nucleic acid probes lack homopolymeric stretches of A
or T.

Preferably, a spatially-addressable set of single
exon nucleic acid probes in accordance with the first
aspect of the invention is addressably disposed upon a
substrate.

Suitable substrates include a filter membrane
which may, preferably, be nitrocellulose or nylon. The
nylon may, preferably, be positively-charged. Other suitable
substrates include glass, amorphous silicon, crystalline
silicon, and plastic. Further suitable materials include
polymethylacrylic, polyethylene, polypropylene,
polyacrylate, polymethylmethacrylate, polyvinylchloride,
polytetrafluoroethylene, polystyrene, polycarbonate,
polyacetal, polysulfone, cellulose acetate,
cellulose nitrate, nitrocellulose, and mixtures thereof.

In a second aspect of the invention, there is
provided a microarray comprising a spatially addressable
set of single exon nucleic acid probes in accordance with
the first aspect of the invention.

In one embodiment, a genome-derived single-exon
microarray is packaged together with such an ordered set of
amplifiable probes corresponding to the probes, or one or
more subsets of probes, thereon. In alternative
embodiments, the ordered set of amplifiable probes is
packaged separately from the genome-derived single exon
microarray.

In another aspect, the invention provides
genome-derived single exon nucleic acid probes useful for gene
expression analysis, and particularly for gene expression
analysis by microarray. In particular embodiments of this
aspect, the present invention provides human single-exon
probes that include specifically-hybridizable fragments of
SEQ ID NOs: 9,981 - 19,771, wherein the fragment hybridizes
at high stringency to an expressed human gene. In
particular embodiments, the invention provides single exon
probes comprising SEQ ID NOs: 1 - 9,980.

Accordingly, in a third aspect of the invention,
there is provided a single exon nucleic acid probe for
measuring human gene expression in a sample derived from
human heart which is a nucleic acid molecule comprising a
nucleotide sequence as set out in any of SEQ ID NOs: 1 -
9,980 or a complementary sequence or a fragment thereof
wherein said probe hybridizes at high stringency to a
nucleic acid expressed in the human heart.

In one embodiment, a single exon nucleic acid
probe in accordance with the third aspect comprises a
nucleotide sequence as set out in any of SEQ ID NOs: 9,981
- 19,771 or a complementary sequence or a fragment thereof.

In a fourth aspect of the invention, there is
provided a single exon nucleic acid probe for measuring
human gene expression in a sample derived from human heart
which is a nucleic acid molecule having a sequence encoding
a peptide comprising a peptide sequence as set out in any
of SEQ ID NOs: 19,772 - 29,119 or a complementary sequence
or a fragment thereof wherein said probe hybridizes at high
stringency to a nucleic acid expressed in the human heart.

Preferably, a single exon nucleic acid probe in
accordance with the third or fourth aspects of the
invention comprises between 15 and 50 contiguous
nucleotides of said SEQ ID NO. It is preferred that the
single exon nucleic acid probe comprises at least 15,
suitably at least 20, more suitably at least 25 or
preferably at least 50 contiguous nucleotides of said SEQ
ID NO.

In another preferred embodiment, a single exon
nucleic acid probe in accordance with the third or fourth
aspects of the invention is between 3kb and 25kb in length.
It is preferred that said probe is no more than 3kb,
suitably no more than 5kb, more suitably no more than 10kb,
preferably 15kb, more preferably 20kb or, most preferably,
no more than 20kb in length.

Preferably, a single exon nucleic acid probe in
accordance with either the fifth or sixth aspect of the
invention is DNA, preferably single-stranded DNA, RNA or
PNA.

In another embodiment of either the third or
fourth aspect of the invention, a single exon nucleic acid
probe is detectably labeled. Suitable detectable labels
include a radionuclide, a fluorescent label or a first
member of a specific binding pair. Suitable fluorescent
labels include dyes such as cyanine dyes, preferably Cy3
and Cy5, although other suitable dyes will be known to those
skilled in the art.

In a particularly preferred embodiment, a single
exon nucleic acid probe in accordance with either the third
or fourth aspect of the invention lacks prokaryotic and
bacteriophage vector sequence. In yet another embodiment, a
single exon nucleic acid probe in accordance with either
the third or fourth aspect of the invention lacks
homopolymeric stretches of A or T.

In a fifth aspect of the invention, there is
provided an amplifiable nucleic acid composition,
comprising:

the single exon nucleic acid probe in accordance
with either of the third or fourth aspects of the
invention; and at least one nucleic acid primer;

wherein said at least one primer is sufficient to
prime enzymatic amplification of said probe.

In a sixth aspect of the invention, there is
provided a method of measuring gene expression in a sample
derived from human heart, comprising:

contacting the single exon microarray in
accordance with the second aspect of the invention with a
first collection of detectably labeled nucleic acids, said
first collection of nucleic acids derived from mRNA of
human heart; and then

measuring the label detectably bound to each
probe of said microarray.

In a seventh aspect of the invention, there is
provided a method of identifying exons in a eukaryotic
genome, comprising:

algorithmically predicting at least one exon from
genomic sequence of said eukaryote; and then

detecting specific hybridization of detectably
labeled nucleic acids to a single exon probe,

wherein said detectably labeled nucleic acids are
derived from mRNA from the heart of said eukaryote, said
probe is a single exon probe having a fragment identical in
sequence to, or complementary in sequence to, said
predicted exon, said probe is included within a single exon
microarray in accordance with the first aspect of the
invention, and said fragment is selectively hybridizable at
high stringency.

In an eighth aspect of the invention, there is
provided a method of assigning exons to a single gene,
comprising:

identifying a plurality of exons from genomic
sequence in accordance with the seventh aspect of the
invention; and then

measuring the expression of each of said exons in
a plurality of tissues and/or cell types using
hybridization to single exon microarrays having a probe
with said exon,

wherein a common pattern of expression of said
exons in said plurality of tissues and/or cell types
indicates that the exons should be assigned to a single
gene.
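
One way to picture a "common pattern of expression" is as a high
correlation between the per-tissue expression profiles of two exons.
The sketch below is illustrative only: the text does not specify a
similarity measure, so the use of Pearson correlation, the 0.9
threshold, the function names, and the example values are all
assumptions.

```python
from itertools import combinations
from math import sqrt
from typing import Dict, List, Tuple


def pearson(x: List[float], y: List[float]) -> float:
    """Pearson correlation of two equal-length expression profiles."""
    n = len(x)
    mean_x, mean_y = sum(x) / n, sum(y) / n
    cov = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    var_x = sum((a - mean_x) ** 2 for a in x)
    var_y = sum((b - mean_y) ** 2 for b in y)
    if var_x == 0 or var_y == 0:
        return 0.0  # a flat profile carries no pattern to compare
    return cov / sqrt(var_x * var_y)


def coexpressed_pairs(profiles: Dict[str, List[float]],
                      threshold: float = 0.9) -> List[Tuple[str, str]]:
    """Return exon pairs whose profiles across tissues correlate at or
    above the (assumed) threshold; such pairs are candidates for
    assignment to a single gene."""
    return [
        (a, b)
        for a, b in combinations(sorted(profiles), 2)
        if pearson(profiles[a], profiles[b]) >= threshold
    ]


if __name__ == "__main__":
    # Hypothetical expression values, one per tissue, for three exons.
    example = {
        "exon1": [1.0, 2.0, 3.0, 4.0],
        "exon2": [2.0, 4.0, 6.0, 8.0],
        "exon3": [4.0, 3.0, 2.0, 1.0],
    }
    print(coexpressed_pairs(example))
```

Correlation is insensitive to overall signal level, which suits exons
of one gene whose probes may hybridize with different efficiencies.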

In a ninth aspect of the invention, there is
provided a nucleic acid sequence as set out in any of SEQ
ID NOs: 1 - 19,771 wherein said sequence encodes a peptide.

In a tenth aspect of the invention, there is
provided a peptide encoded by a sequence comprising a
sequence as set out in any of SEQ ID NOs: 9,981 - 19,771,
or a complementary sequence or coding portion thereof.

In a preferred embodiment, a peptide may be
encoded by a sequence comprising a sequence set out in any
of SEQ ID NOs: 1 - 9,980.

In a further aspect, the invention provides
peptides comprising an amino acid sequence translated from
the DNA fragments, said amino acid sequences comprising SEQ
ID NOs: 9,981 - 19,771.

Accordingly, in an eleventh aspect of the invention
there is provided a peptide comprising a sequence as set
out in any of SEQ ID NOs: 19,772 - 29,119, or fragment
thereof.

In another aspect, the invention provides means
for displaying annotated sequence, and in particular, for
displaying sequence annotated according to the methods and
apparatus of the present invention. Further, such display
can be used as a preferred graphical user interface for
electronic search, query, and analysis of such annotated
sequence.



Detailed Description of the Invention 



Definitions 

As used herein, the term "microarray" and phrase
"nucleic acid microarray" refer to a substrate-bound
collection of plural nucleic acids, hybridization to each
of the plurality of bound nucleic acids being separately
detectable. The substrate can be solid or porous, planar
or non-planar, unitary or distributed.

As so defined, the term "microarray" and phrase
"nucleic acid microarray" include all the devices so called
in Schena (ed.), DNA Microarrays: A Practical Approach
(Practical Approach Series), Oxford University Press (1999)
(ISBN: 0199637768); Nature Genet. 21(1)(suppl):1-60
(1999); and Schena (ed.), Microarray Biochip: Tools and
Technology, Eaton Publishing Company/BioTechniques Books
Division (2000) (ISBN: 1881299376). As so defined, the
term "microarray" and phrase "nucleic acid microarray"
further include substrate-bound collections of plural
nucleic acids in which the nucleic acids are distributably
disposed on a plurality of beads, rather than on a unitary
planar substrate, as is described, inter alia, in Brenner
et al., Proc. Natl. Acad. Sci. USA 97(4):1665-1670 (2000);
in such case, the term "microarray" and phrase "nucleic
acid microarray" refer to the plurality of beads in
aggregate.

As used herein with respect to a nucleic acid
microarray, the term "probe" refers to the nucleic acid
that is, or is intended to be, bound to the substrate; in
such context, the term "target" thus refers to nucleic acid
intended to be bound thereto by Watson-Crick
complementarity. As used herein with respect to solution
phase hybridization, the term "probe" refers to the nucleic
acid of known sequence that is detectably labeled.

As used herein, the expression "probe comprising
SEQ ID NO.", and variants thereof, intends a nucleic acid
probe, at least a portion of which probe has either (i) the
sequence directly as given in the referenced SEQ ID NO., or
(ii) a sequence complementary to the sequence as given in
the referenced SEQ ID NO., the choice as between sequence
directly as given and complement thereof being dictated by
the requirement that the probe hybridize to mRNA.
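
The choice between a listed sequence and its complement can be made
concrete with a short sketch. It is illustrative only and rests on
stated assumptions: sequences are written 5' to 3' in the DNA alphabet
(T standing in for U in the mRNA), "complementary" is taken to mean
the reverse complement, and hybridization is reduced to exact
substring matching. The function names are hypothetical.

```python
_COMPLEMENT = str.maketrans("ACGTacgt", "TGCAtgca")


def reverse_complement(seq: str) -> str:
    """Return the reverse complement of a DNA sequence written 5'->3'."""
    return seq.translate(_COMPLEMENT)[::-1]


def hybridizing_form(listed: str, mrna_sense: str) -> str:
    """Pick the form of a listed sequence that can base-pair with the mRNA.

    listed      -- sequence as given in a sequence listing
    mrna_sense  -- mRNA sequence in the DNA alphabet (an assumption)
    """
    complement = reverse_complement(listed)
    if complement in mrna_sense:
        # The listed sequence is already antisense to the mRNA.
        return listed
    if listed in mrna_sense:
        # The listed sequence is sense, so its complement is the probe.
        return complement
    raise ValueError("listed sequence does not match the mRNA in either sense")


if __name__ == "__main__":
    mrna = "ATGGCCATTGTAATGGGCCGC"  # hypothetical transcript fragment
    print(hybridizing_form("GCCATTGTA", mrna))
```

Real hybridization tolerates some mismatch and depends on stringency;
exact matching is used here only to keep the orientation logic visible.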

As used herein, the term "open reading frame" and
the equivalent acronym "ORF" refer to that portion of an
exon that can be translated in its entirety into a sequence
of contiguous amino acids, i.e., a nucleic acid sequence
that, in at least one reading frame, does not possess stop
codons; the term does not require that the ORF encode the
entirety of a natural protein.
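
As a minimal sketch of this definition, the following checks which
reading frames of a sequence are free of stop codons. It is
illustrative only and assumes the standard stop codons (TAA, TAG,
TGA), the DNA alphabet, and the forward strand; none of these choices
is stated in the text, and the function name is hypothetical.

```python
from typing import List

STOP_CODONS = {"TAA", "TAG", "TGA"}  # standard genetic code (assumed)


def stop_free_frames(seq: str) -> List[int]:
    """Return the forward reading frames (0, 1, 2) containing no stop codon.

    Only complete codons are examined; a trailing partial codon is ignored.
    """
    seq = seq.upper()
    frames = []
    for frame in range(3):
        codons = (seq[i:i + 3] for i in range(frame, len(seq) - 2, 3))
        if not any(codon in STOP_CODONS for codon in codons):
            frames.append(frame)
    return frames


if __name__ == "__main__":
    # Frame 0 reads ATG GCC TAA GGC and therefore contains a stop codon.
    print(stop_free_frames("ATGGCCTAAGGC"))
```

Under the definition above, a sequence qualifies as an ORF when the
returned list is non-empty, whether or not it begins with a start codon.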

As used herein, the term "amplicon" refers to a 
PCR product amplified from human genomic DNA, containing 
the predicted exon. 

As used herein, the term "exon" refers to the
consensus prediction of the various exon and gene
predicting algorithms, i.e., a nucleic acid sequence
bioinformatically predicted to encode a portion of a
natural protein.

As used herein, the term "peptide" refers to a
sequence of amino acids. The sequences referred to as
PEPTIDE SEQ ID NOS. are the predicted peptide sequences
that would be translated from one of the exons, or a
portion thereof, set out in exon SEQ ID NOS. The codons
encoding the peptide are wholly contained within the exon.

As used herein, "portions" of a defined
nucleotide sequence or sequences can be and, preferably,
are fragments unique to that sequence or to one or a
combination of those sequences. A fragment unique to a
nucleic acid molecule is one that is a signature for the
larger nucleic acid molecule.

As used herein, the phrase "expression of a 
probe" and its linguistic variants means that the ORF 
present within the probe, or its complement, is present 
within a target mRNA. 

As used herein, "stringent conditions" refers to
parameters well known to those skilled in the art. When a
nucleic acid molecule is said to be hybridisable to another
of a given sequence under "stringent conditions" it is
meant that it is homologous to the given sequence.

As used herein, the phrase "specific binding
pair" intends a pair of molecules that bind to one another
with high specificity. Binding pairs are said to exhibit
specific binding when they exhibit avidity of at least 10^7,
preferably at least 10^8, more preferably at least 10^9
liters/mole. Nonlimiting examples of specific binding
pairs are: antibody and antigen; biotin and avidin; and
biotin and streptavidin.

As used herein with respect to the visual display
of annotated genomic sequence, the term "rectangle" means
any geometric shape that has at least a first and a second
border, wherein the first and second borders each are
capable of mapping uniquely to a point of another visual
object of the display.

As used herein, a "Mondrian" means a visual
display in which a single genomic sequence is annotated
with predicted and experimentally confirmed functional
information.



Brief Description of the Drawings

The present invention is further illustrated with 
reference to the following non-limiting figures and 
examples in which: 

FIG. 1 illustrates a process for predicting
functional regions from genomic sequence, confirming the
functional activity of such regions experimentally, and
associating and displaying the data so obtained in
meaningful and useful relationship to the original sequence
data;

FIG. 2 further elaborates that portion of the 
process schematized in FIG. 1 for predicting functional 
regions from genomic sequence; 

FIG. 3 illustrates a Mondrian visual display;

FIG. 4 presents a Mondrian showing a hypothetical
annotated genomic sequence;

FIG. 5 is a histogram showing the distribution of
ORF lengths and PCR product lengths as obtained, with ORF
length shown in black and PCR product length shown in dotted
lines;

FIG. 6 is a histogram showing the distribution,
among exons predicted according to the methods described,
of expression as measured using simultaneous two color
hybridization to a genome-derived single exon microarray.
The graph shows the number of sequence-verified products
that were either not expressed ("0"), expressed in one or
more but not all tested tissues ("1" - "9"), or expressed
in all tissues tested ("10");

FIG. 7 is a pictorial representation of the
expression of verified sequences that showed expression
with signal intensity greater than 3 in at least one
tissue, with: FIG. 7A showing the expression as measured by
microarray hybridization in each of the 10 measured
tissues, and the expression as measured "bioinformatically"
by query of EST, NR and SwissProt databases; with FIG. 7B
showing the legend for display of physical expression
(ratio) in FIG. 7A; and with FIG. 7C showing the legend for
scoring EST hits as depicted in FIG. 7A;

FIG. 8 shows a comparison of normalized Cy3
signal intensity for arrayed sequences that were identical
to sequences in existing EST, NR and SwissProt databases or
that were dissimilar (unknown), where black denotes the
signal intensity for all sequence-verified products with a
BLAST Expect ("E") value of greater than 1e-30 (1 x 10^-30)
("unknown") and a dotted line denotes sequence-verified
spots with a BLAST Expect ("E") value of less than 1e-30
(1 x 10^-30) ("known");

FIG. 9 presents a Mondrian of BAC AC008172 (bases
25,000 to 130,000), containing the carbamyl phosphate
synthetase gene (AF154830.1); and

FIG. 10 is a Mondrian of BAC A049839. 



Methods and Apparatus for Predicting, Confirming,
Annotating, and Displaying Functional Regions From Genomic
Sequence Data

FIG. 1 is a flow chart illustrating in broad
outline a process for predicting functional regions from
genomic sequence, confirming and characterizing the
functional activity of such regions experimentally, and
then associating and displaying the information so obtained
in meaningful and useful relationship to the original
sequence data.

The initial input into process 10 of the present
invention is drawn from one or more databases 100
containing genomic sequence data. Because genomic sequence
is usually obtained from subgenomic fragments, the sequence
data typically will be stored in a series of records
corresponding to these subgenomic sequenced fragments.
Some fragments will have been catenated to form larger
contiguous sequences ("contigs"); others will not. A
finite percentage of sequence data in the database will
typically be erroneous, consisting inter alia of vector
sequence, sequence created from aberrant cloning events,
sequence of artificial polylinkers, and sequence that was
erroneously read.

Each sequence record in database 100 will
minimally contain as annotation a unique sequence
identifier (accession number), and will typically be
annotated further to identify the date of accession,
species of origin, and depositor. Because database 100 can
contain nongenomic sequence, each sequence will typically
be annotated further to permit query for genomic sequence.
Chromosomal origin, optionally with map location, can also
be present. Data can be, and over time increasingly will
be, further annotated with additional information, in part
through use of the present invention, as described below.
Annotation can be present within the data records, in
information external to database 100 and linked to the
records, or through a combination of the two.

Databases useful as genomic sequence database 100
in the present invention include GenBank, and particularly
include several divisions thereof, including the
htgs (draft), NT (nucleotide, command line), and NR
(nonredundant) divisions. GenBank is produced by the
National Institutes of Health and is maintained by the
National Center for Biotechnology Information (NCBI).
Databases of genomic sequence from species other than
human, such as mouse, rat, Arabidopsis, C. elegans, C.
briggsae, Drosophila, zebrafish, and other higher
eukaryotic organisms will also prove useful as genomic
sequence database 100.

Genomic sequence obtained by query of genomic
sequence database 100 is then input into one or more
processes 200 for identification of regions therein that
are predicted to have a biological function as specified by
the user. Such functions include, but are not limited to,
encoding protein, regulating transcription, regulating
message transport after transcription into mRNA, regulating
message splicing after transcription into mRNA, or
regulating message degradation after transcription into
mRNA, and the like. Other functions include directing
somatic recombination events, contributing to chromosomal
stability or movement, contributing to allelic exclusion or
X chromosome inactivation, and the like.

The particular genomic sequence to be input into 
process 200 will depend upon the function for which 
relevant sequence is to be identified as well as upon the
approach chosen for such identification. Process step 200
can be iterated to identify different functions within a 
given genomic region. In such case, the input often will 
be different for the several iterations. 

Sequences predicted to have the requisite
function by process 200 are then input into process 300,
where a subset of the input sequences suitable for
experimental confirmation is identified. Experimental
confirmation can involve physical and/or bioinformatic
assay. Where the subsequent experimental assay is
bioinformatic, rather than physical, there are fewer
constraints on the sequences that can be tested, and in
this latter case therefore process 300 can output the
entirety of the input sequence.

The subset of sequences output from process 300
is then used in process 400 for experimental verification
and characterization of the function predicted in
process 200, which experimental verification can, and often
will, include both physical and bioinformatic assay.

Process 500 annotates the sequence data with the
functional information obtained in the physical and/or
bioinformatic assays of process 400. Such annotation can
be done using any technique that usefully relates the
functional information to the sequence, as, for example, by
incorporating the functional data into the sequence data
record itself, by linking records in a hierarchical or
relational database, by linking to external databases, by a
combination thereof, or by other means well known within
the database arts. The data can even be submitted for
incorporation into databases maintained by others, such as
GenBank, which is maintained by NCBI.

As further noted in FIG. 1, additional annotation 
can be input into process 500 from external sources 600. 

The annotated data is then displayed in process
800, either before, concomitantly with, or after optional
storage 700 on nontransient media, such as magnetic disk,
optical disc, magneto-optical disk, flash memory, or the
like.

FIG. 1 shows that the experimental data output
from process 400 can be used in each preceding step of
process 10: e.g., facilitating identification of functional
sequences in process 200, facilitating identification of an
experimentally suitable subset thereof in process 300, and
facilitating creation of physical and/or informational
substrates for, and performance of subsequent assay of,
functional sequences in process 400.

Information from each step can be passed directly
to the succeeding process, or stored in permanent or
interim form prior to passage to the succeeding process.
Often, data will be stored after each, or at least a
plurality, of such process steps. Any or all process steps
can be automated.

FIG. 2 further elaborates the prediction of 
functional sequence within genomic sequence according to 
process 200. 

Genomic sequence database 100 is first queried 20
for genomic sequence.

The sequence required to be returned by query 20
will depend, in the first instance, upon the function to be 
identified. 

For example, genomic sequences that function to
encode protein can be identified inter alia using gene
prediction approaches, comparative sequence analysis
approaches, or combinations of the two. In gene prediction
analysis, sequence from one genome is input into process
200 where at least one, preferably a plurality, of
algorithmic methods are applied to identify putative coding
regions. In comparative sequence analysis, by contrast,
corresponding, e.g., syntenic, sequence from a plurality of
sources, typically a plurality of species, is input into
process 200, where at least one, possibly a plurality, of
algorithmic methods are applied to compare the sequences
and identify regions of least variability.

The exact content of query 20 will also depend 
upon the database queried. For example, if the database
contains both genomic and nongenomic sequence, perhaps
derived from multiple species, and the function to be 
determined is protein coding regions in human genomic 
sequence, the query will accordingly require that the 
sequence returned be genomic and derived from humans. 

Query 20 can also incorporate criteria that
compel return of sequence that meets operative requirements
of the subsequent analytical method. Alternatively, or in 
addition, such operative criteria can be enforced in 
subsequent preprocess step 24. 

For example, if the function sought to be
identified is protein coding, query 20 can incorporate
criteria that return from genomic sequence database 100
only those sequences present within contigs sufficiently
long as to have obviated substantial fragmentation of any
given exon among a plurality of separate sequence
fragments.

Such criteria can, for example, consist of a
required minimal individual genomic sequence fragment
length, such as 10 kb, more typically 20 kb, 30 kb, 40 kb,
and preferably 50 kb or more, as well as an optional
further or alternative requirement that sequence from any
given clone, such as a bacterial artificial chromosome
("BAC"), be presented in no more than a finite maximal
number of fragments, such as no more than 20 separate
pieces, more typically no more than 15 fragments, even more
typically no more than about 10 - 12 fragments.

Results using the present invention have shown
that genomic sequence from bacterial artificial chromosomes
(BACs) is sufficient for gene prediction analysis according
to the present invention if the sequence is at least 50 kb
in length, and if additionally the sequence from any given
BAC is presented in fewer than 15, and preferably fewer
than 10, fragments. Accordingly, query 20 can incorporate
a requirement that data accessioned from BAC sequencing be
in fewer than 15, preferably fewer than 10, fragments.
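By way of non-limiting illustration, the length and
fragment-count criteria just described can be sketched in
Python as follows. The record structure (CloneRecord), the
function name (select_fragments), and the parameter names are
hypothetical and are introduced here only for illustration;
the defaults of 50 kb and fewer than 15 fragments are taken
from the preceding paragraphs.

```python
from dataclasses import dataclass
from typing import List, Tuple


@dataclass
class CloneRecord:
    """Hypothetical record for one sequenced clone (e.g., a BAC)."""
    accession: str               # unique sequence identifier
    fragment_lengths: List[int]  # length in bp of each sequenced fragment


def select_fragments(records: List[CloneRecord],
                     min_fragment_bp: int = 50_000,
                     max_fragments: int = 15) -> List[Tuple[str, int]]:
    """Return (accession, fragment index) pairs that satisfy both criteria.

    A clone is skipped entirely unless it is presented in fewer than
    max_fragments pieces; within a retained clone, only fragments of at
    least min_fragment_bp are returned.
    """
    selected = []
    for rec in records:
        if len(rec.fragment_lengths) >= max_fragments:
            continue  # clone is too fragmented for reliable exon calling
        for index, length in enumerate(rec.fragment_lengths):
            if length >= min_fragment_bp:
                selected.append((rec.accession, index))
    return selected
```

In practice such criteria could equally be expressed in the
query language of the database itself; the sketch merely makes
the two thresholds explicit.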

An additional criterion that can be incorporated
into the query can be the date, or range of dates, of
sequence accession. Although the process has been
described above as if genomic sequence database 100 were
static, it is of course understood that the genomic
sequence databases need not be static, and indeed are
typically updated on a frequent, even hourly, basis. Thus,
as further described in Examples 1 and 2, infra, it is
possible to query the database for newly added sequence,
either newly added after an absolute date, or newly added




WO 01/57274 PCT/US01/00666 

relative to a prior analysis performed using the methods 
and apparatus of the present invention. In this way, the 
process herein described can incorporate a dynamic, 
temporal component. 
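A minimal sketch of such a temporal criterion, assuming that
each record is available as a pair of accession identifier and
accession date, is given below in Python. The function name
and the tuple layout are assumptions made for illustration
only.

```python
from datetime import date
from typing import List, Tuple


def newly_accessioned(records: List[Tuple[str, date]],
                      since: date) -> List[str]:
    """Return identifiers of records accessioned strictly after `since`.

    `since` can be an absolute date or the date of a prior analysis.
    """
    return [accession for accession, accessioned in records
            if accessioned > since]
```

Passing the date of the previous run as `since` restricts each
new analysis to sequence added in the interim.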
One utility of such temporal limitation is to
identify, from newly accessioned genomic sequence, the
presence of novel genes, particularly those not previously
identified by EST sequencing (or other sequencing efforts
that are similarly based upon gene expression). As further
described in Example 1, such an approach has shown that
newly accessioned human genomic sequence, when analyzed for
sequences that function to encode protein, readily
identifies genes that are novel over those in existing EST
and other expression databases. This makes the methods of
the present invention extremely powerful gene discovery
tools. And as would be appreciated, such gene discovery
can be performed using genomic sequence from species other
than human.

If query 20 incorporates multiple criteria, such
as those described above, the multiple criteria can be applied
as a series of separate queries or as a single query,
depending in part upon the query language, the complexity 
of the query, and other considerations well known in the 
database arts. 

If query 20 returns no genomic sequence meeting
the query criteria, the negative result can be reported by
process 22, and process 200 (and indeed, entire process 10)
ended 23, as shown. Alternatively, or in addition to
report and termination of the initial inquiry, a new query
20 can be generated that takes into account the initial
negative result.

When query 20 returns sequence meeting the query
criteria, the returned sequence is then passed to optional
preprocessing 24, suitable and specific for the desired
analytical approach and the particular analytical methods
thereof to be used in process 25.

Preprocessing 24 can include processes suitable 
for many approaches and methods thereof, as well as 
processes specifically suited for the intended subsequent 
analysis.

Preprocessing 24 suitable for most approaches and 
methods will include elimination of sequence irrelevant to, 
or that would interfere with, the subsequent analysis. 
Such sequence includes repetitive sequence, such as Alu
repeats and LINE elements, vector sequence, artificial
sequence, such as artificial polylinkers, and the like. 
Such removal can readily be performed by identification and 
subsequent masking of the undesired sequence. 

Identification can be effected by comparing the
genomic sequence returned by query 20 with public or
private databases containing known repetitive sequence,
vector sequence, artificial sequence, and other artifactual
sequence. Such comparison can readily be done using
programs well known in the art, such as CROSS_MATCH, or by
proprietary sequence comparison programs the engineering of
which is well within the skill in the art.

Alternatively, or in addition, undesirable,
including artifactual, sequence can be identified
algorithmically without comparison to external databases
and thereafter removed. For example, synthetic polylinker
sequence can be identified by an algorithm that identifies
a significantly higher than average density of known
restriction sites. As another example, vector sequence can
be identified by algorithms that identify nucleotide or
codon usage at variance with that of the bulk of the
genomic sequence.
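The restriction-site density approach can be illustrated by
the following Python sketch, which flags windows whose density
of restriction sites is several-fold above the average for the
whole sequence. The particular set of recognition sites, the
window size, the step, and the fold threshold are assumptions
chosen for illustration and are not values taken from the
present disclosure.

```python
from typing import List, Tuple

# Recognition sites of a few common restriction enzymes (EcoRI, BamHI,
# HindIII, PstI, SalI, XbaI, SmaI, KpnI); the choice is illustrative.
SITES = ("GAATTC", "GGATCC", "AAGCTT", "CTGCAG",
         "GTCGAC", "TCTAGA", "CCCGGG", "GGTACC")


def count_sites(seq: str) -> int:
    """Count occurrences of any listed site lying wholly within seq."""
    return sum(1 for i in range(len(seq))
               for site in SITES if seq.startswith(site, i))


def polylinker_windows(seq: str, window: int = 60, step: int = 10,
                       fold: float = 5.0) -> List[Tuple[int, int]]:
    """Return half-open (start, end) windows with unusually dense sites."""
    seq = seq.upper()
    if len(seq) < window:
        return []
    mean_density = count_sites(seq) / len(seq)
    if mean_density == 0:
        return []  # no restriction sites anywhere in the sequence
    hits = []
    for start in range(0, len(seq) - window + 1, step):
        density = count_sites(seq[start:start + window]) / window
        if density >= fold * mean_density:
            hits.append((start, start + window))
    return hits
```

Windows returned by such a routine could then be passed to the
masking step described below.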

Once identified, undesired sequence can be
removed. Removal can usefully be done by masking the
undesired sequence as, for example, by converting the
specific nucleotides to a symbol that is unrecognized
by the subsequent bioinformatic algorithms, such as "X".
Alternatively, but at present less preferred, the undesired
sequence can be excised from the returned genomic sequence,
leaving gaps.
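Masking of this kind can be sketched in Python as follows.
The function name and the use of half-open, zero-based
coordinates are assumptions made for illustration; the mask
symbol "X" is the one mentioned above.

```python
from typing import Iterable, Tuple


def mask_intervals(seq: str, intervals: Iterable[Tuple[int, int]],
                   mask_char: str = "X") -> str:
    """Replace each half-open (start, end) interval of seq with mask_char.

    The masked sequence keeps its original length, so downstream
    coordinates remain valid.
    """
    chars = list(seq)
    for start, end in intervals:
        for i in range(max(start, 0), min(end, len(chars))):
            chars[i] = mask_char
    return "".join(chars)
```

For example, mask_intervals("ACGTACGT", [(2, 5)]) returns
"ACXXXCGT". Preserving sequence length is one reason masking
may be preferred over excision.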

Preprocessing 24 can further include selection
from among duplicative sequences of that one sequence of
highest quality. Higher quality can be measured as a lower
percentage of, fewest number of, or least densely clustered
occurrence of ambiguous nucleotides, defined as those
nucleotides that are identified in the genomic sequence
using symbols indicating ambiguity. Higher quality can
also or alternatively be valued by presence in the longest
contig.
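One of the quality measures mentioned above, the percentage
of ambiguous nucleotides, can be sketched in Python as
follows. The sketch assumes that ambiguity is denoted by the
standard IUPAC ambiguity codes and breaks ties in favor of the
longer sequence; both choices, like the function names, are
illustrative assumptions.

```python
from typing import Sequence

# IUPAC nucleotide codes that denote more than one possible base.
AMBIGUITY_CODES = set("NRYKMSWBDHV")


def ambiguous_fraction(seq: str) -> float:
    """Fraction of positions in seq that carry an ambiguity symbol."""
    if not seq:
        return 1.0  # treat an empty sequence as lowest quality
    return sum(base in AMBIGUITY_CODES for base in seq.upper()) / len(seq)


def best_duplicate(seqs: Sequence[str]) -> str:
    """Pick the duplicate with the lowest ambiguous fraction.

    Ties are broken in favor of the longer sequence.
    """
    return min(seqs, key=lambda s: (ambiguous_fraction(s), -len(s)))
```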

Preprocessing 24 can, and often will, also
include formatting of the data as specifically appropriate
for passage to the analytical algorithms of process 25.
Such formatting can and typically will include, inter alia,
addition of a unique sequence identifier, either derived
from the original accession number in genomic sequence
database 100, or newly applied, and can further include
additional annotation. Formatting can include conversion
from one to another sequence listing standard, such as
conversion to or from FASTA or the like, depending upon the
input expected by the subsequent process.
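As an illustration of such formatting, the following Python
sketch writes a sequence and its identifier as a FASTA record.
The function name and the line width of 60 characters are
assumptions; any width accepted by the subsequent program
could be used.

```python
def to_fasta(identifier: str, seq: str, width: int = 60) -> str:
    """Format one sequence as a FASTA record.

    The header line begins with '>' followed by the unique sequence
    identifier; the sequence is wrapped at `width` characters per line.
    """
    lines = [">" + identifier]
    lines += [seq[i:i + width] for i in range(0, len(seq), width)]
    return "\n".join(lines) + "\n"
```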

Preprocessing, which can be optional depending
upon the function desired to be identified and the
informational requirements of the methods for effecting
such identification, is followed by sequence processing 25,
where sequences with the desired function are identified
within the genomic sequence.

As mentioned above, such functions can include,
but are not limited to, encoding protein, regulating
transcription, regulating message transport after
transcription into mRNA, regulating message splicing after
transcription, or regulating message degradation, and the
like. Other functions include directing somatic
recombination events, contributing to chromosomal stability
or movement, contributing to allelic exclusion or X
chromosome inactivation, or the like.
The methods of the present invention are
particularly useful for gene discovery, that is, for
identifying, from genomic sequence, regions that function
to encode genes, and in a particularly useful embodiment,
for identifying regions that function to encode genes not
hitherto identified by expression-based or directed cloning
and sequencing. In conjunction with verification using the
novel single exon microarrays of the present invention, as
further described below, the methods herein described
become powerful gene discovery tools.

Accordingly, in a preferred embodiment of the
present invention, process 25 is used to identify putative
coding regions. Two preferred approaches in process 25 for 
identifying sequence that encodes putative genes are gene 
prediction and comparative sequence analysis. 

Gene prediction can be performed using any of a
number of algorithmic methods, embodied in one or more
software programs, such as GRAIL, DICTION, and GENEFINDER,
that identify open reading frames (ORFs) using a variety
of heuristics. Comparative sequence analysis similarly can be
performed using any of a variety of known programs that
identify regions with lower sequence variability.

As further described in Example 1, below, gene
finding software programs yield a range of results. For
the newly accessioned human genomic sequence input in
Example 1, for example, GRAIL identified the greatest
percentage of genomic sequence as putative coding region,
2% of the data analyzed; GENEFINDER was second, calling 1%;
and DICTION yielded the least putative coding region, with
0.8% of genomic sequence called as coding region.

Increased reliability can be obtained when
consensus is required among several such methods. Although
discussed herein particularly with respect to exon calling, 
consensus among methods will in general increase 
reliability of predicting other functions as well. 
Thus, as indicated by query 26, sequence
processing 25, optionally with preprocessing 24, can be
repeated with a different method, with consensus among such 
iterations determined and reported in process 27. 

Process 27 compares the several outputs for a
given input genomic sequence and identifies consensus among
the separately reported results. The consensus itself, as
well as the sequence meeting that consensus, is then stored
in process 29a, displayed in process 29b, and/or output to
process 300 for subsequent identification of a subset
thereof suitable for assay.

Multiple levels of consensus can be calculated
and reported by process 27. For example, as further
described in Example 1, infra, process 27 can report
consensus as between all specific pairs of methods of gene
prediction, as consensus among any one or more of the pairs
of methods of gene prediction, or as among all of the gene
prediction algorithms used. Thus, in Example 1, process 27
reported that GRAIL and GENEFINDER programs agreed on 0.7%
of genomic sequence, that GRAIL and DICTION agreed on 0.5%
of genomic sequence, and that the three programs together
agreed on 0.25% of the data analyzed. Put another way,
0.25% of the genomic sequence was identified by all three
of the programs as containing putative coding region.
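One way in which such consensus could be computed is sketched
below in Python, under the assumption that each gene
prediction method reports its calls as a sorted list of
non-overlapping, half-open coordinate intervals on the same
input sequence. The function names and the interval
representation are hypothetical; the sketch does not purport
to reproduce the implementation used in Example 1.

```python
from functools import reduce
from typing import Dict, List, Tuple

Interval = Tuple[int, int]  # half-open (start, end) on the input sequence


def intersect(a: List[Interval], b: List[Interval]) -> List[Interval]:
    """Intersect two sorted lists of non-overlapping intervals."""
    out, i, j = [], 0, 0
    while i < len(a) and j < len(b):
        start = max(a[i][0], b[j][0])
        end = min(a[i][1], b[j][1])
        if start < end:
            out.append((start, end))
        # Advance whichever interval ends first.
        if a[i][1] < b[j][1]:
            i += 1
        else:
            j += 1
    return out


def consensus(calls: Dict[str, List[Interval]]) -> List[Interval]:
    """Regions called by every method in `calls` (at least one required)."""
    return reduce(intersect, calls.values())


def covered_fraction(intervals: List[Interval], length: int) -> float:
    """Fraction of a sequence of `length` bp covered by the intervals."""
    return sum(end - start for start, end in intervals) / length
```

Pairwise consensus is obtained by passing only two methods to
consensus(); the percentage figures reported above correspond
to covered_fraction() of the respective result.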

Furthermore, consensus can be required among
different approaches to identifying a chosen function.

For example, if the function desired to be
identified is coding of protein sequence, and a first used
approach to exon calling is gene prediction, the process
can be repeated on the same input sequence, or subset
thereof, with another approach, such as comparative
sequence analysis. In such a case, where comparative
sequence analysis follows gene prediction, the comparison
can be performed not only on genomic nucleic acid sequence,
but additionally or alternatively can be performed on the
predicted amino acid sequence translated from the ORFs
previously identified by the gene prediction approach.

Although shown as an iterative process, the 
multiple analyses required to achieve consensus can be done 
in series, in parallel, or some combination thereof. 
Predicted functional sequence, optionally
representing a consensus among a plurality of methods and
approaches for determination thereof, is passed to process 
300 for identification of a subset thereof for functional 
assay. 

In the preferred embodiment of the methods of the
present invention, wherein the function sought to be
identified is protein coding, process 300 is used to
identify a subset thereof suitable for experimental
verification by physical and/or bioinformatic approaches.

For example, putative ORFs identified in process
200 can be classified, or binned, bioinformatically into
putative genes. This binning can be based inter alia upon
consideration of the average number of exons/gene in the
species chosen for analysis, upon density of exons that
have been called on the genomic sequence, and other
empirical rules. Thereafter, one or more among the
gene-specific ORFs can be chosen for subsequent use in gene
expression assay.

Where such subsequent gene expression assay uses
amplified nucleic acid, considerations such as desired
amplicon length, primer synthesis requirements, putative
exon length, sequence GC content, existence of possible
secondary structure, and the like can be used to identify
and select those ORFs that appear most likely to amplify
successfully. Where subsequent gene expression assay relies
upon nucleic acid hybridization, whether or not using
amplified product, further considerations involving
hybridization stringency can be applied to identify that
subset of sequences that will most readily permit
sequence-specific discrimination at a chosen hybridization
and wash stringency. One particular such consideration is
avoidance of putative exons that span repetitive sequence; such
sequence can hybridize spuriously to nonspecific message,
reducing specific signal in the hybridization.

For bioinformatic assay, there are fewer
constraints on the sequences that can be tested
experimentally, and in this latter case therefore process 
300 can output the entirety of the input sequence. 

The subset of sequences identified by process 300
as suitable for use in assay is then used in process 400 to
create the physical and/or informational substrate for
experimental verification of the predictions made in
process 200, and thereafter to assay those substrates.

As mentioned, the methods of the present
invention are particularly useful for identifying potential
coding regions within genomic sequence. In a preferred
embodiment of process 400, therefore, the expression of the
sequences predicted to encode protein is verified. The
combination of the predictive and experimental methods
provides a powerful gene discovery engine.

Thus, in another aspect, the present invention 
provides methods and apparatus for verifying the expression 
of putative genes identified within genomic sequence. In 
particular, the invention provides a novel method of
verifying gene expression in which expression of predicted
ORFs is measured and confirmed using a novel type of 
nucleic acid microarray, the genome-derived single exon 
nucleic acid microarrays of the present invention. 

Putative ORFs as predicted by a consensus of gene
calling, particularly gene prediction, algorithms in
process 200, and as further identified as suitable by
process 300, are amplified from genomic DNA using the
polymerase chain reaction (PCR). Although PCR is
conveniently used, other amplification approaches can also
be used.

Amplification schemes can be designed to capture 
the entirety of each predicted ORF in an amplicon with 
minimal additional (that is, intronic or intergenic) 
sequence. Because ORFs predicted from human genomic 
sequence using the methods of the present invention differ
in length, such an approach results in amplicons of varying 
length. 

However, most predicted ORFs are shorter than 500
bp in length, and although amplicons of at least about 100
or 200 base pairs can be immobilized as probes on nucleic
acid microarrays, early experimental results using the
methods of the present invention have suggested that longer
amplicons, at least about 400 or 500 base pairs, are more
effective. Furthermore, certain advantages derive from
application to the microarray of amplicons of defined size.

Therefore, amplification schemes can
alternatively, and preferably, be designed to amplify
regions of defined size, preferably at least about 300, 400
or 500 bp, centered about each predicted ORF. Such an
approach results in a population of amplicons of limited
size diversity, but that typically contain intronic and/or
intergenic nucleic acid in addition to putative ORF.

Conversely, somewhat fewer than 10% of ORFs
predicted from human genomic sequence according to the
methods of the present invention exceed 500 bp in length.
Portions of such extended ORFs, preferably at least about
300, 400 or 500 bp in length, can be amplified. However, it
has been discovered that the percentage success at
amplifying pieces of such ORFs is low, and that such
putative exons are more effectively amplified when larger
fragments, at least about 1000 or 1500 bp, and even as
large as 2000 bp, are amplified.

The putative ORFs selected in process 300 are
thus input into one or more primer design programs, such as
PRIMER3 (available online for use at
http://www-genome.wi.mit.edu/cgi-bin/primer/ ), with a goal
of amplifying at least about 500 base pairs of genomic
sequence centered within or about ORFs predicted to be no
more than about 500 bp, or at least about 1000 - 1500 bp of
genomic sequence for ORFs predicted to exceed 500 bp in
length. Primers with the requisite sequences can be purchased
commercially or synthesized by standard techniques.
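The choice of target region that precedes primer design can
be illustrated by the following Python sketch, which computes
a window of defined size centered about a predicted ORF. The
function and parameter names are hypothetical, coordinates are
assumed to be zero-based and half-open, and the sketch selects
only the region to be amplified; the primers themselves would
still be chosen by a primer design program such as PRIMER3.

```python
from typing import Tuple


def amplicon_window(orf_start: int, orf_end: int, seq_len: int,
                    short_target: int = 500, long_target: int = 1000,
                    orf_cutoff: int = 500) -> Tuple[int, int]:
    """Return a half-open (start, end) target region for one ORF.

    ORFs of at most `orf_cutoff` bp receive a window of `short_target`
    bp; longer ORFs receive `long_target` bp. The window is centered
    on the ORF midpoint and shifted, if necessary, to stay within the
    genomic sequence of length `seq_len`.
    """
    orf_len = orf_end - orf_start
    target = short_target if orf_len <= orf_cutoff else long_target
    center = (orf_start + orf_end) // 2
    start = center - target // 2
    start = max(0, min(start, seq_len - target))
    end = min(seq_len, start + target)
    return start, end
```

For example, a 200 bp ORF at positions 1000 to 1200 of a
10,000 bp fragment yields the window (850, 1350).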

Conveniently, a first predetermined sequence can
be added commonly to the ORF-specific 5' primer and a
second, typically different, predetermined sequence
commonly added to each 3' ORF-unique primer. This serves
to immortalize the amplicon, that is, serves to permit
further amplification of any amplicon using a single set of
primers complementary respectively to the common 5' and
common 3' sequence elements. The presence of these
"universal" priming sequences further facilitates later
sequence verification, providing a sequence common to all
amplicons at which to prime sequencing reactions. The
common 5' and 3' sequences further serve to add a cloning
site should any of the ORFs warrant further study.
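The addition of common sequences to the ORF-specific primers
can be sketched in Python as follows. The two 16 nt tail
sequences shown are arbitrary placeholders and are not the
sequences used in the examples; the sketch further assumes
that each primer is written 5' to 3' and that the common
sequence is attached at the 5' end of the ORF-specific
portion.

```python
from typing import Tuple

# Placeholder 16 nt tails; NOT the sequences used in the examples.
UNIVERSAL_5 = "ACGTACGTACGTACGT"
UNIVERSAL_3 = "TGCATGCATGCATGCA"


def tailed_primers(forward: str, reverse: str) -> Tuple[str, str]:
    """Prepend the common tails to an ORF-specific primer pair.

    Both primers are given and returned in 5'-to-3' orientation, so
    the tail lies at the 5' end and the ORF-specific portion remains
    at the 3' end, from which the polymerase extends.
    """
    return UNIVERSAL_5 + forward, UNIVERSAL_3 + reverse
```

Because every amplicon then carries the same two terminal
sequences, a single primer pair matching the tails suffices
for reamplification or for priming sequencing reactions.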

Such predetermined sequence is usefully at least 
about 10, 12 or 15 nt in length, and usually does not 
exceed about 25 nt in length. The "universal" priming
sequences used in the examples presented infra were each 16
nt long. 

The genomic DNA to be used as substrate for
amplification will come from the eukaryotic species from
which the genomic sequence data had originally been
obtained, or a closely related species, and can
conveniently be prepared by well known techniques from
somatic or germline tissue or cultured cells of the
organism. See, e.g., Short Protocols in Molecular Biology:
A Compendium of Methods from Current Protocols in
Molecular Biology, Ausubel et al. (eds.), 4th edition
(April 1999), John Wiley & Sons (ISBN: 047132938X) and
Maniatis et al., Molecular Cloning: A Laboratory Manual,
2nd edition (December 1989), Cold Spring Harbor Laboratory
Press (ISBN: 0879693096). Many such prepared genomic DNAs
are available commercially, with the human genomic DNAs
additionally having certification of donor informed
consent.

Although the intronic and intergenic material
flanking putative coding regions in the amplicons could
potentially interfere with hybridizations during microarray
experiments, we have found, surprisingly, that differential
expression ratios are not significantly affected. Instead,
the predominant effect of exon size is to alter the
absolute signal intensity, not its ratio. Equally
surprising, although the art had suggested that single exon
probes would not provide sufficient signal intensity for high
stringency hybridization analyses, we find that such probes
not only provide adequate signal, but have substantial
advantages, as herein described.

After partial purification, as by size exclusion
spin column, with or without confirmation as to amplicon
quality as by gel electrophoresis, each amplicon (single
exon probe) is disposed in an array upon a support
substrate.

Methods for creating microarrays by deposition
and fixation of nucleic acids onto support substrates are
well known in the art (reviewed by Schena et al., see
above).

Typically, the support substrate will be glass,
although other materials, such as amorphous or crystalline
silicon or plastics, can also be used. Such plastics include
polymethylacrylic, polyethylene, polypropylene,
polyacrylate, polymethylmethacrylate, polyvinylchloride,
polytetrafluoroethylene, polystyrene, polycarbonate,
polyacetal, polysulfone, cellulose acetate,
cellulose nitrate, nitrocellulose, or mixtures thereof.
Typically, the support will be rectangular,
although other shapes, particularly circular disks and even
spheres, present certain advantages. Particularly
advantageous alternatives to glass slides as support
substrates for array of nucleic acids are optical discs, as
described in WO 98/12559.

The amplified nucleic acids can be attached
covalently to a surface of the support substrate or, more
typically, applied to a derivatized surface in a chaotropic
agent that facilitates denaturation and adherence by
presumed noncovalent interactions, or some combination
thereof.

Robotic spotting devices useful for arraying
nucleic acids on support substrates can be constructed
using public domain specifications (The MGuide, version
2.0, http://cmgm.stanford.edu/pbrown/mguide/index.html), or
can conveniently be purchased from commercial sources
(MicroArray GenII Spotter and MicroArray GenIII Spotter,
Molecular Dynamics, Inc., Sunnyvale, CA). Spotting can
also be effected by printing methods, including those using
ink jet technology.

As is well known in the art, microarrays
typically also contain immobilized control nucleic acids.
For controls useful in providing measurements of background
signal for the genome-derived single exon microarrays of
the present invention, a plurality of E. coli genes can
readily be used. As further described in Example 1, 16 or
32 E. coli genes suffice to provide a robust measure of
background noise in such microarrays.
As is well known in the art, the amplified
product disposed in arrays on a support substrate to create
a nucleic acid microarray can consist entirely of natural
nucleotides linked by phosphodiester bonds, or
alternatively can include either nonnative nucleotides,
alternative internucleotide linkages, or both, so long as
complementary binding can be obtained in the hybridization.
If enzymatic amplification is used to produce the
immobilized probes, the amplifying enzyme will impose
certain further constraints upon the types of nucleic acid
analogs that can be generated.

Although particularly described herein as using
high density microarrays constructed on planar substrates,
the methods of the present invention for confirming the
expression of ORFs predicted from genomic sequence can use
any of the known types of microarrays, as herein defined,
including lower density planar arrays, and microarrays on
nonplanar, nonunitary, distributed substrates.

For example, gene expression can be confirmed
using hybridization to lower density arrays, such as those
constructed on membranes, such as nitrocellulose, nylon,
and positively-charged derivatized nylon membranes.
Further, gene expression can also be confirmed using
nonplanar, bead-based microarrays such as are described in
Brenner et al., Proc. Natl. Acad. Sci. USA 97(4):1665-1670
(2000); U.S. Patent No. 6,057,107; and U.S. Patent No.
5,736,330. In theory, a packed collection of such beads
provides in aggregate a higher density of nucleic acid
probe than can be achieved with spotting or lithography
techniques on a single planar substrate.

Planar microarrays on solid substrates, however,
provide certain useful advantages, including high
throughput and compatibility with existing readers. For
example, each standard microscope slide can include at
least 1000, typically at least 2000, preferably 5000 and
up to 10,000 - 50,000 or more nucleic acid probes of
discrete sequence. The number of sequences deposited will
depend on their required application.

Each putative gene can be represented in the 
5 array by a single predicted ORF. Alternatively, genes can 
be represented by more than one predicted ORF. For 
purposes of measuring differential splicing, more than one 
predicted ORF will be provided for a putative gene. And as 
is well known in the art, each probe of defined sequence, 

10 representing a single predicted ORF, can be deposited in a 
plurality of locations on a single microarray to provide 
redundancy of signal. 

The genome-derived single exon microarrays
described above differ in several fundamental and
advantageous ways from microarrays presently used in the
gene expression art, including (1) those created by
deposition of mRNA-derived nucleic acids, (2) those created
by in situ synthesis of oligonucleotide probes, and (3)
those constructed from yeast genomic DNA.

Most nucleic acid microarrays that are in use for
study of eukaryotic gene expression have as immobilized
probes nucleic acids that are derived, either directly or
indirectly, from expressed message. As discussed above,
it is common, for example, for such microarrays to be
derived from cDNA/EST libraries, either from those
previously described in the literature, see Lennon et al.,
or from the de novo construction of "problem specific"
libraries targeted at a particular biological question,
R.S. Thomas et al., Cancer Res. (in press). Such
microarrays are herein collectively denominated "EST
microarrays".

Such EST microarrays by definition can measure
expression only of those genes found in EST libraries,
shown herein to represent only a fraction of expressed
genes. Furthermore, such libraries, and thus microarrays
based thereupon, are biased by the tissue or cell type of
message origin, by the expression levels of the respective
genes within the tissues, and by the ability of the message
successfully to have been reverse-transcribed and cloned.
Thus, as further discussed in Example 1, the
methods of the present invention enable sequences that do
not appear in EST or other expression databases to be
determined and subsequently arrayed for expression
measurements; such sequences could not, therefore, have
been represented as probes on an EST microarray. And as
further demonstrated in the examples, infra, the remaining
population of genes identified from genomic sequence by the
methods of the present invention, that is, the one third of
sequences that had previously been accessioned in EST or
other expression databases, is biased toward genes with
higher expression levels.

Representation of a message in an EST and/or cDNA
library depends upon the successful reverse transcription,
optionally but typically with subsequent successful
cloning, of the message. This introduces substantial bias
into the population of probes available for arraying in EST
microarrays.

In contrast, neither reverse transcription nor
cloning is required to produce the probes arrayed on the
genome-derived single exon microarrays of the present
invention. And although the ultimate deposition of a probe
on the genome-derived single exon microarray of the present
invention depends upon a successful amplification from
genomic material, a priori knowledge of the sequence of the
desired amplicon affords greater opportunity to recover any
given probe sequence recalcitrant to amplification than is
afforded by the requirement for successful reverse
transcription and cloning of unknown message in EST
approaches.

Thus, the genome-derived single exon microarrays
of the present invention present a far greater diversity of
probes for measuring gene expression, with far less bias,
than do EST microarrays presently used in the art.

As a further consequence of their ultimate origin
from expressed message, the probes in EST microarrays often
contain poly-A (or complementary poly-T) stretches derived
from the poly-A tail of mature mRNA. These homopolymeric
stretches contribute to cross-hybridization, that is, to a
spurious signal occasioned by hybridization to the
homopolymeric tail of a labeled cDNA that lacks sequence
homology to the gene-specific portion of the probe.

In contrast, the probes arrayed in the genome-derived
single exon microarrays of the present invention
lack homopolymeric stretches derived from message
polyadenylation, and thus can provide more specific signal.
Typically, at least about 50, 60 or 75% of the probes on
the genome-derived single exon microarrays of the present
invention lack homopolymeric regions consisting of A or T,
where a homopolymeric region is defined for purposes herein
as stretches of 25 or more, typically 30 or more, identical
nucleotides.
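
By way of non-limiting illustration, the homopolymeric-region
definition given above (runs of 25 or more identical A or T
nucleotides) can be checked computationally. The following
Python sketch is an assumption of this description, not part of
the disclosed method; the function names are hypothetical.

```python
import re


def has_homopolymer(seq, bases="AT", min_run=25):
    """Return True if seq contains a run of at least min_run
    identical nucleotides drawn from `bases` (A or T by default)."""
    seq = seq.upper()
    # Builds a pattern such as "A{25,}|T{25,}"
    pattern = "|".join(f"{b}{{{min_run},}}" for b in bases)
    return re.search(pattern, seq) is not None


def fraction_without_homopolymer(probes, min_run=25):
    """Fraction of probe sequences lacking any A or T homopolymeric region."""
    clean = sum(1 for p in probes if not has_homopolymer(p, min_run=min_run))
    return clean / len(probes)
```

Setting min_run to 30 applies the stricter of the two thresholds
recited above.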

A further distinction, which also affects the
specificity of hybridization, is occasioned by the typical
derivation of EST microarray probes from cloned material.
Because much of the probe material disposed as probes on
EST microarrays is excised or amplified from plasmid,
phage, or phagemid vectors, EST microarrays typically
include a fair amount of vector sequence, more so when the
probes are amplified, rather than excised, from the vector.

In contrast, the vast majority of probes in the
genome-derived single exon microarrays of the present
invention contain no prokaryotic or bacteriophage vector
sequence, having been amplified directly or indirectly from
genomic DNA. Typically, therefore, at least about 50, 60,
70 or 80% or more of individual exon-including probes
disposed on a genome-derived single exon microarray of the
present invention lack vector sequence, and particularly
lack sequences drawn from plasmids and bacteriophage.
Preferably, at least about 85, 90 or more than 90% of
exon-including probes in the genome-derived single exon
microarray of the present invention lack vector sequence.
With attention to removal of vector sequences through
preprocessing 24, percentages of vector-free exon-including
probes can be as high as 95 - 99%. The substantial absence
of vector sequence from the genome-derived single exon
microarrays of the present invention results in greater
specificity during hybridization, since spurious
cross-hybridization to a probe vector sequence is reduced.

As a further consequence of excision or
amplification of probes from vectors in construction of EST
microarrays, the probes arrayed thereon often contain
artificial sequence, derived from vector polylinker
multiple cloning sites, at both 5' and 3' ends. The probes
disposed upon the genome-derived single exon microarrays
need have no such artificial sequence appended thereto.

As mentioned above, however, the ORF-specific
primers used to amplify putative ORFs can include
artificial sequences, typically 5' to the ORF-specific
primer sequence, useful for "universal" (that is,
independent of ORF sequence) priming of subsequent
amplification or sequencing reactions. When such
"universal" 5' and/or 3' priming sequences are appended to
the amplification primers, the probes disposed upon the
genome-derived single exon microarray will include
artificial sequence similar to that found in EST
microarrays. However, the genome-derived single exon
microarray of the present invention can be made without
such sequences, and if so constructed, presents an even
smaller amount of nonspecific sequence that would
contribute to nonspecific hybridization.

Yet another consequence of typical use of cloned
material as probes in EST microarrays is that such
microarrays contain probes that result from cloning
artifacts, such as chimeric molecules containing coding
region of two separate genes. Derived from genomic
material, typically not thereafter cloned, the probes of
the genome-derived single exon microarrays of the present
invention lack such cloning artifacts, and thus provide
greater specificity of signal in gene expression
measurements.

A further consequence of the cloned origin of
probes on many EST microarrays is that the individual
probes often have disparate sizes, which can cause the
optimal hybridization stringency to vary among probes on a
single microarray. In contrast, as discussed above, the
probes arrayed on the genome-derived single exon
microarrays of the present invention can readily be
designed to have a narrow distribution in sizes, with the
range of probe sizes no greater than about 10% of the
average size, typically no greater than about 5% of the
average probe size.
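
As an illustrative sketch only, the size criterion above can be
evaluated for a candidate probe set as follows. The reading of
"range" as the difference between the longest and shortest
probe, divided by the mean length, is an assumption of this
example, and the function name is hypothetical.

```python
def size_range_fraction(lengths):
    """Range of probe lengths (max - min) expressed as a fraction
    of the mean probe length."""
    mean = sum(lengths) / len(lengths)
    return (max(lengths) - min(lengths)) / mean


def meets_size_criterion(lengths, limit=0.10):
    """True if the size range does not exceed `limit` (10% by default)."""
    return size_range_fraction(lengths) <= limit
```

For probes of 480, 500 and 520 bp, for instance, the range is 40
bp against a mean of 500 bp, or 8% of the average size.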

Because of their origin from fully- or partially-spliced
message, probes disposed upon EST arrays will often
include multiple exons. The percentage of such
exon-spanning probes in an EST microarray can be calculated,
on average, based upon the predicted number of exons/gene
for the given species and the average length of the
immobilized probes. For human genes, the near-complete
sequence of human chromosome 22, Dunham et al., Nature
402(6761):489-95 (1999), predicts that human genes average
5.5 exons/gene. Even with probes of 200 - 500 bp, the vast
majority of human EST microarray probes include more than
one exon.

In contrast, by virtue of their origin from
algorithmically identified ORFs in genomic sequence, the
probes in the genome-derived single exon microarrays of the
present invention can consist of individual exons. Thus,
in contrast to EST microarrays, at least about 50, 60, 70,
75, 80, 85, 95 or 99% of probes deposited in the
genome-derived microarray of the present invention consist
of, or include, no more than one predicted ORF.

This provides the ability, not readily achieved
using EST microarrays, to use the genome-derived single
exon microarrays of the present invention to measure
tissue-specific expression of individual exons, which in
turn allows differential splicing events to be detected and
characterized, and in particular, allows the correlation of
differential splicing to tissue-specific expression
patterns.

Furthermore, the exons that are represented in
EST microarrays are often biased toward the 3' or 5' end of
their respective genes, since sequencing strategies used
for EST identification are so biased. In contrast, no such
3' or 5' bias necessarily inheres in the selection of exons
for disposition on the genome-derived single exon
microarrays of the present invention.

Conversely, the probes provided on the genome-derived
single exon microarrays of the present invention
typically, but need not necessarily, include intronic
and/or intergenic sequence that is absent from EST
microarrays, which are derived from mature mRNA.
Typically, at least about 50, 60, 70, 80 or 90% of the
exon-including probes on the genome-derived single exon
microarrays of the present invention include sequence drawn
from noncoding regions. As discussed above, the additional
presence of noncoding region does not significantly
interfere with measurement of gene expression, and provides
the additional opportunity to assay prespliced RNA, and
thus measure phenomena such as nuclear export control.

The genome-derived single exon microarrays of the
present invention are also quite different from in situ
synthesis microarrays, where probe size is severely
constrained by inadequacies in the photolithographic
synthesis process.

Typically, probes arrayed on in situ synthesis
microarrays are limited to a maximum of about 25 bp. As a
well known consequence, hybridization to such chips must be
performed at low stringency. In order, therefore, to
achieve unambiguous sequence-specific hybridization
results, the in situ synthesis microarray requires
substantial redundancy, with concomitant programmed
arraying for each probe of probe analogues with altered
(i.e., mismatched) sequence.

In contrast, the longer probe length of the
genome-derived single exon microarrays of the present
invention allows much higher stringency hybridization and
wash. Typically, therefore, exon-including probes on the
genome-derived single exon microarrays of the present
invention average at least about 100, 200, 300, 400 or
500 bp in length. By obviating the need for substantial
probe redundancy, this approach permits a higher density of
probes for discrete exons or genes to be arrayed on the
microarrays of the present invention than can be achieved
for in situ synthesis microarrays.

A further distinction is that the probes in in
situ synthesis microarrays typically are covalently linked
to the substrate surface. In contrast, the probes disposed
on the genome-derived microarray of the present invention
typically are, but need not necessarily be, bound
noncovalently to the substrate.

Furthermore, the short probe size on in situ
microarrays causes large percentage differences in the
melting temperature of probes hybridized to their
complementary target sequence, and thus causes large
percentage differences in the theoretically optimum
stringency across the array as a whole.

In contrast, the larger probe size in the
microarrays of the present invention creates lower
percentage differences in melting temperature across the
range of arrayed probes.
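
The effect of probe length on melting temperature spread can be
illustrated numerically. The sketch below is offered under
stated assumptions only: it uses a widely quoted empirical
approximation, Tm = 81.5 + 16.6 log10[Na+] + 0.41(%GC) - 675/N,
which is not taken from the present disclosure, a salt
concentration of 0.1 M chosen arbitrarily, and a seeded random
sequence standing in for genomic DNA.

```python
import math
import random


def tm_celsius(gc_fraction, length, na_molar=0.1):
    """Empirical duplex Tm approximation (an assumption of this sketch)."""
    return (81.5 + 16.6 * math.log10(na_molar)
            + 41.0 * gc_fraction - 675.0 / length)


def tm_percent_spread(seq, window):
    """Spread (max - min) of Tm over all probes of length `window`
    tiled along seq, as a percentage of the mean Tm."""
    tms = []
    for i in range(len(seq) - window + 1):
        probe = seq[i:i + window]
        gc = (probe.count("G") + probe.count("C")) / window
        tms.append(tm_celsius(gc, window))
    return 100.0 * (max(tms) - min(tms)) / (sum(tms) / len(tms))


# Seeded random sequence, a stand-in for real genomic sequence
rng = random.Random(0)
genome = "".join(rng.choice("ACGT") for _ in range(5000))

short_spread = tm_percent_spread(genome, 25)   # in situ-sized probes
long_spread = tm_percent_spread(genome, 500)   # amplicon-sized probes
```

Because each 500-mer is composed of twenty 25-mers, its GC
content is an average of theirs, so the longer probes necessarily
span a narrower range of GC content and hence of predicted Tm.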
A further significant advantage of the
microarrays of the present invention over in situ
synthesized arrays is that the quality of each individual
probe can be confirmed before deposition. In contrast, the
quality of probes cannot be assessed on a probe-by-probe
basis for the in situ synthesized microarrays presently
being used.

The genome-derived single exon microarrays of the
present invention are also distinguished over, and present
substantial benefits over, the genome-derived microarrays
from lower eukaryotes such as yeast. Lashkari et al.,
Proc. Natl. Acad. Sci. USA 94:13057-13062 (1997).

Only about 220 - 250 of the 6100 or so nuclear
genes in Saccharomyces cerevisiae, that is, only about
4 - 5%, have standard, spliceosomal, introns, Lopez et al.,
Nucl. Acids Res. 28:85-86 (2000); Spingola et al., RNA
5(2):221-34 (1999). Furthermore, the entire yeast genome
has already been sequenced. These two facts permit the
ready amplification and disposition of single-ORF amplicons
on such microarrays without the requirement for antecedent
use of gene prediction and/or comparative sequence
analyses.

Thus, a significant aspect of the present
invention is the ability to identify and to confirm
expression of predicted coding regions in genomic sequence
drawn from eukaryotic organisms that have a higher
percentage of genes having introns than do yeast such as
Saccharomyces cerevisiae, particularly in genomic sequence
drawn from eukaryotes in which at least about 10, 20 or 50%
of protein-encoding genes have introns. In preferred
embodiments, the methods and apparatus of the present
invention are used to identify and confirm expression of
novel genes from genomic sequence of eukaryotes in which
the average number of introns per gene is at least about
one, two or three or more.
After the physical substrate is prepared,
experimental verification of predicted function is
performed.

In a preferred embodiment of the present
invention, where the function sought to be identified in
genomic sequence is protein coding, experimental
verification is performed by measuring expression of the
putative ORFs, typically through nucleic acid hybridization
experiments, and in particularly preferred embodiments,
through hybridization to genome-derived single exon
microarrays prepared as above-described.

Expression is conveniently measured and expressed
for each probe in the microarray as a ratio of the
expression measured concurrently in a plurality of mRNA
sources, according to techniques well known in the
microarray art, reviewed in Schena et al., and as further
described in Example 2, below. The mRNA source for the
reference against which specific expression is measured can
be drawn from a homogeneous mRNA source, such as a single
cultured cell-type, or alternatively can be heterogeneous,
as from a pool of mRNA derived from multiple tissues and/or
cell types, as further described in Example 2, infra.
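
As a minimal sketch of the ratio calculation described above,
the following Python function computes, for each probe, the
base-2 logarithm of index signal over reference signal. The
uniform background subtraction, the use of a log2 scale, and all
names are assumptions made for illustration; they are not drawn
from Example 2.

```python
import math


def expression_ratios(index_signal, reference_signal, background=0.0):
    """Per-probe log2(index / reference) after background subtraction.

    Both arguments map probe identifiers to scanned intensities.
    Probes whose net signal is not positive in either channel are
    reported as None rather than given a ratio.
    """
    ratios = {}
    for probe_id, idx in index_signal.items():
        idx_net = idx - background
        ref_net = reference_signal[probe_id] - background
        if idx_net <= 0 or ref_net <= 0:
            ratios[probe_id] = None
        else:
            ratios[probe_id] = math.log2(idx_net / ref_net)
    return ratios
```

A value of 2.0 for a probe thus denotes four-fold higher signal
in the index source than in the reference, and 0.0 denotes equal
signal.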

mRNA can be prepared by standard techniques, see
Ausubel et al. and Maniatis et al., or purchased
commercially. The mRNA is then typically
reverse-transcribed in the presence of labeled nucleotides:
the index source (that in which expression is desired to be
measured) is reverse-transcribed in the presence of
nucleotides labeled with a first label, typically a
fluorophore (fluorochrome; fluor; fluorescent dye); the
reference source is reverse-transcribed in the presence of
a second label, typically a fluorophore, typically
fluorometrically distinguishable from the first label. As
further described in Example 2, infra, Cy3 and Cy5 dyes
prove particularly useful in these methods. After partial
purification of the index and reference targets,
hybridization to the probe array is conducted according to
standard techniques, typically under a coverslip.

After wash, microarrays are conveniently scanned
using a commercial microarray scanning device, such as a
Gen3 Scanner (Molecular Dynamics, Sunnyvale, CA). Data on
expression is then passed, with or without interim storage,
to process 500, where the results for each probe are
related to the original sequence.

Often, hybridization of target material to the
genome-derived single exon microarray will identify certain
of the probes thereon as of particular interest. Thus, it
is often desirable that the user be able readily to obtain
sufficient quantities of an individual probe, either for
subsequent arrayed deposition upon an additional support
substrate, often as part of a microarray having a plurality
of probes so identified, or alternatively or additionally
as a solitary solid-phase or solution-phase probe, for
further use.

Thus, in another aspect, the present invention
provides compositions and kits for the ready production of
nucleic acids identical in sequence to, or substantially
identical in sequence to, probes on the genome-derived
single exon microarrays of the present invention.

In this aspect, a small quantity of each probe is
disposed, typically without attachment to substrate, in a
spatially-addressable ordered set, typically one per well
of a microtiter dish. Although a 96 well microtiter plate
can be used, greater efficiency is obtained using higher
density arrays, such as are provided by microtiter plates
having 384, 864, 1536, 3456, 6144, or 9600 wells, and
although microtiter plates having physical depressions
(wells) are conveniently used, any device that permits
addressable withdrawal of reagent from
fluidly-noncommunicating areas can be used.
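
One possible addressing scheme for such a spatially-addressable
ordered set is sketched below. The row-major filling order, the
restriction to the 96-, 384- and 1536-well formats (8 x 12,
16 x 24 and 32 x 48 wells respectively), and the function names
are assumptions of this example rather than requirements of the
invention.

```python
# Rows x columns for three common plate formats
PLATE_LAYOUTS = {96: (8, 12), 384: (16, 24), 1536: (32, 48)}


def row_label(row):
    """Zero-based row index to letters: 0 -> A, 25 -> Z, 26 -> AA."""
    label = ""
    row += 1
    while row > 0:
        row, rem = divmod(row - 1, 26)
        label = chr(ord("A") + rem) + label
    return label


def well_address(probe_index, wells=384):
    """Map a zero-based probe index to (plate number, well name),
    filling each plate row by row."""
    rows, cols = PLATE_LAYOUTS[wells]
    plate, offset = divmod(probe_index, wells)
    row, col = divmod(offset, cols)
    return plate, f"{row_label(row)}{col + 1}"
```

With this mapping, the same probe index used to address a spot
on the microarray can be used to retrieve the corresponding
amplifiable probe from the ordered set.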
In this aspect of the invention, therefore, a
fluidly noncommunicating addressable ordered set of
individual probes, corresponding to those on a
genome-derived single exon microarray, is provided, with each
probe in sufficient quantity to permit amplification, such
as by PCR. As earlier mentioned, the ORF-specific
5' primers used for genomic amplification can have a first
common sequence added thereto, and the ORF-specific 3'
primers used for genomic amplification can have a second,
different, common sequence added thereto, thus permitting,
in this preferred embodiment, the use of a single set of 5'
and 3' primers to amplify any one of the probes from the
amplifiable ordered set.
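
The primer arrangement described above can be sketched as
follows. The two common tail sequences are invented placeholders
for illustration only, as are the 20-nucleotide ORF-specific
primer length and the function names; none is taken from the
present disclosure.

```python
COMPLEMENT = str.maketrans("ACGT", "TGCA")


def revcomp(seq):
    """Reverse complement of a DNA sequence (upper-case A, C, G, T)."""
    return seq.translate(COMPLEMENT)[::-1]


# Hypothetical common sequences, invented for this example
TAIL_5 = "GATTACAGATTACAGGTC"   # first common sequence (5' primers)
TAIL_3 = "CCTGAAGTCCTGAAGTCA"   # second, different common sequence (3' primers)


def tailed_primers(target, primer_len=20):
    """ORF-specific primers with the common tails added at their 5' ends."""
    forward = TAIL_5 + target[:primer_len]
    reverse = TAIL_3 + revcomp(target[-primer_len:])
    return forward, reverse


def amplicon(target):
    """Top strand of the product made with the tailed primers."""
    return TAIL_5 + target + revcomp(TAIL_3)


def universal_primers_match(product):
    """True if the single primer pair (TAIL_5, TAIL_3) can prime on product."""
    return product.startswith(TAIL_5) and product.endswith(revcomp(TAIL_3))
```

Because every amplicon so produced carries the same two terminal
sequences, one pair of primers corresponding to the tails
suffices to reamplify any member of the ordered set.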

Each discrete amplifiable probe can also be
packaged with amplification primers, solutes, buffers,
etc., and can be provided in dry (e.g., lyophilized) form
or wet, in the latter case typically with addition of
agents that retard evaporation.

In another aspect of the present invention, a
genome-derived single-exon microarray is packaged together
with such an ordered set of amplifiable probes
corresponding to the probes, or one or more subsets of
probes, thereon. In alternative embodiments, the ordered
set of amplifiable probes is packaged separately from the
genome-derived single exon microarray.

In some embodiments, the microarray and/or
ordered probe set are further packaged with recordable
media that provide probe identification and addressing
information, and that can additionally contain annotation
information, such as gene expression data. Such recordable
media can be packaged with the microarray, with the ordered
probe set, or with both.

If the microarray is constructed on a substrate
that incorporates recordable media, such as is described in
international patent application no. WO 98/12559, then
separate packaging of the genome-derived single exon
microarray and the bioinformatic information is not
required.

The amount of amplifiable probe material should
be sufficient to permit at least one amplification
sufficient for subsequent hybridization assay.

Although the use of high density genome-derived
microarrays on solid planar substrates is presently a
preferred approach for the physical confirmation and
characterization of the expression of sequences predicted
to encode protein, other types of microarrays (as herein
defined) can also be used.

Furthermore, as earlier mentioned, experimental
verification of the function predicted from genomic
sequence in process 200 can be bioinformatic, rather than,
or additional to, physical verification.

For example, where the function desired to be
identified is protein coding, the predicted ORFs can be
compared bioinformatically to sequences known or suspected
of being expressed.

Thus, the sequences output from process 300 (or
process 200), can be used to query expression databases,
such as EST databases, SNP ("single nucleotide
polymorphism") databases, known cDNA and mRNA sequences,
SAGE ("serial analysis of gene expression") databases, and
more generalized sequence databases that allow query for
expressed sequences. Such query can be done by any
sequence query algorithm, such as BLAST ("basic local
alignment search tool"). The results of such query
(including information on identical sequences and
information on nonidentical sequences that have diffuse or
focal regions of sequence homology to the query sequence)
can then be passed directly to process 500, or used to
inform analyses subsequently undertaken in process 200,
process 300, or process 400.
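
Purely to illustrate the data flow of such a query, the sketch
below matches a predicted ORF against a small in-memory
collection of expressed sequences by counting shared exact
12-mers. This toy k-mer comparison is a stand-in chosen for
brevity; it is not BLAST and does not reproduce its scoring, and
the accession names used with it would be hypothetical.

```python
def kmers(seq, k):
    """Set of all substrings of length k in seq."""
    return {seq[i:i + k] for i in range(len(seq) - k + 1)}


def query_expression_db(orf, database, k=12, min_shared=1):
    """Return (accession, shared k-mer count) for every database
    entry sharing at least min_shared exact k-mers with orf,
    ordered from most to fewest shared k-mers."""
    orf_kmers = kmers(orf, k)
    hits = []
    for accession, expressed_seq in database.items():
        shared = len(orf_kmers & kmers(expressed_seq, k))
        if shared >= min_shared:
            hits.append((accession, shared))
    return sorted(hits, key=lambda hit: -hit[1])
```

The returned list, whether empty or not, is the kind of result
that would be passed on for annotation of the queried sequence.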
Experimental data, whether obtained by physical
or bioinformatic assay in process 400, is passed to process
500 where it is usefully related to the sequence data
itself, a process colloquially termed "annotation". Such
annotation can be done using any technique that usefully
relates the functional information to the sequence, as, for
example, by incorporating the functional data into the
record itself, by linking records in a hierarchical or
relational database, by linking to external databases, or
by a combination thereof. Such database techniques are
well within the skill in the art.
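
One of the options named above, linking records in a relational
database, might be sketched as follows using Python's standard
sqlite3 module. The table and column names are hypothetical and
chosen only for this example.

```python
import sqlite3


def make_db():
    """Create an in-memory database with a sequence table and a
    linked annotation table (hypothetical schema)."""
    conn = sqlite3.connect(":memory:")
    conn.executescript("""
        CREATE TABLE sequence (
            seq_id INTEGER PRIMARY KEY,
            accession TEXT,
            start INTEGER,
            stop INTEGER
        );
        CREATE TABLE annotation (
            seq_id INTEGER REFERENCES sequence(seq_id),
            kind TEXT,
            value TEXT
        );
    """)
    return conn


def annotate(conn, seq_id, kind, value):
    """Attach one item of functional information to a sequence record."""
    conn.execute("INSERT INTO annotation VALUES (?, ?, ?)",
                 (seq_id, kind, value))


def annotations_for(conn, accession):
    """All (kind, value) annotations linked to the given accession."""
    rows = conn.execute(
        "SELECT a.kind, a.value FROM annotation a "
        "JOIN sequence s ON s.seq_id = a.seq_id "
        "WHERE s.accession = ? ORDER BY a.kind",
        (accession,),
    )
    return rows.fetchall()
```

Keeping annotations in their own table allows physical and
bioinformatic results to accumulate against a sequence record
without altering the record itself.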

The annotated sequence data can be stored 
locally, uploaded to genomic sequence database 100, and/or 
displayed 800. 

The methods and apparatus of the present
invention rapidly produce functional information from
genomic sequence. Coupled with the escalating pace at
which sequence now accumulates, the rapid pace of sequence
annotation produces a need for methods of displaying the
information in meaningful ways.

FIG. 3 shows visual display 80 presenting a
single genomic sequence annotated according to the present
invention. Because of its nominal resemblance to artistic
works of Piet Mondrian, visual display 80 is alternatively
described herein as a "Mondrian".

Each of the visual elements of display 80 is
aligned with respect to the genomic sequence being
annotated (hereinafter, the "annotated sequence"). Given
the number of nucleotides typically represented in an
annotated sequence, representation of individual
nucleotides would rarely be readable in hard copy output of
display 80. Typically, therefore, the annotated sequence
is schematized as rectangle 89, extending from the left
border of display 80 to its right border. By convention
herein, the left border of rectangle 89 represents the
first nucleotide of the sequence and the right border of
rectangle 89 represents the last nucleotide of the
sequence.

As further discussed below, however, the Mondrian
visual display of annotated sequence can serve as a
convenient graphical user interface for computerized
representation, analysis, and query of information stored
electronically. For such use, the individual nucleotides
can conveniently be linked to the X axis coordinate of
rectangle 89. This permits the annotated sequence at any
point within rectangle 89 readily to be viewed, either
automatically (for example, by time-delayed appearance of
a small overlaid window upon movement of a cursor or other
pointer over rectangle 89) or through user intervention,
as by clicking a mouse or other pointing device at a point
in rectangle 89.
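
The linkage between X axis coordinate and nucleotide position
can be sketched as a pair of simple conversions. The pixel
units, the 1-based nucleotide numbering, and the function names
below are assumptions made for illustration.

```python
def nucleotide_at(x, rect_left, rect_width, seq_length):
    """Map an integer pointer x coordinate to the 1-based position
    of the nucleotide drawn there, or None if x lies outside the
    rectangle representing the annotated sequence."""
    if not rect_left <= x < rect_left + rect_width:
        return None
    return (x - rect_left) * seq_length // rect_width + 1


def x_of(position, rect_left, rect_width, seq_length):
    """Left edge, in display units, of a 1-based nucleotide position."""
    return rect_left + (position - 1) * rect_width / seq_length
```

A pointer event handler could call nucleotide_at with the cursor
coordinate and then retrieve the surrounding sequence for
display in an overlaid window.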

Visual display 80 is generated after user
specification of the genomic sequence to be displayed.
Such specification can consist of or include an accession
number for a single clone (e.g., a single BAC accessioned
into GenBank), wherein the starting and stopping
nucleotides are thus absolutely identified, or
alternatively can consist of or include an anchor or
fulcrum point about which a chosen range of sequence is
anchored, thus providing relative endpoints for the
sequence to be displayed. For example, the user can anchor
such a range about a given chromosomal map location, gene
name, or even a sequence returned by query for similarity
or identity to an input query sequence. When visual
display 80 is used as a graphical user interface to
computerized data, additional control over the first and
last displayed nucleotide will typically be dynamically
selectable, as by use of standard zooming and/or selection
tools.

Field 81 of visual display 80 is used to present
the output from process 200, that is, to present the
bioinformatic prediction of those sequences having the
desired function within the genomic sequence. Functional
sequences are typically indicated by at least one rectangle
83 (83a, 83b, 83c), the left and right borders of which
respectively indicate, by their X-axis coordinates, the
starting and ending nucleotides of the region predicted to
have function.

Where a single bioinformatic method or approach
identifies a plurality of regions having the desired
function, a plurality of rectangles 83 is disposed
horizontally in field 81. Where multiple methods and/or
approaches are used to identify function, each such method
and/or approach can be represented by its own series of
horizontally disposed rectangles 83, each such horizontally
disposed series of rectangles offset vertically from those
representing the results of the other methods and
approaches.
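
The arrangement of rectangles 83 described above can be sketched
as a layout calculation in which each method receives its own
vertically offset row. The display width, row height, row gap,
and data structure used below are assumptions of this example,
not features of FIG. 3.

```python
def layout_predictions(predictions, seq_length, width=1000,
                       row_height=12, row_gap=4):
    """Compute one rectangle per predicted region.

    `predictions` maps a method name to a list of (start, end)
    nucleotide positions (1-based, inclusive). Rectangles for one
    method share a row; successive methods are offset vertically.
    """
    rects = []
    for row, (method, regions) in enumerate(predictions.items()):
        y = row * (row_height + row_gap)
        for start, end in regions:
            x0 = (start - 1) * width / seq_length   # left border
            x1 = end * width / seq_length           # right border
            rects.append({"method": method, "x": x0, "y": y,
                          "w": x1 - x0, "h": row_height})
    return rects
```

A drawing routine would then render each returned rectangle,
optionally choosing a distinct color or texture per method.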

Thus, rectangles 83a in FIG. 3 represent the
functional predictions of a first method of a first
approach for predicting function, rectangles 83b represent
the functional predictions of a second method and/or second
approach for predicting that function, and rectangles 83c
represent the predictions of a third method and/or
approach.

Where the function desired to be identified is
protein coding, field 81 is used to present the
bioinformatic prediction of sequences encoding protein.
For example, rectangles 83a can represent the results from
GRAIL or GRAIL II, rectangles 83b can represent the results
from GENEFINDER, and rectangles 83c can represent the
results from DICTION.

Optionally, and preferably, rectangles 83
collectively representing predictions of a single method
and/or approach are identically colored and/or textured,
and are distinguishable from the color and/or texture used
for a different method and/or approach.

Alternatively, or in addition, the color, hue,
density, or texture of rectangles 83 can be used further to
report a measure of the bioinformatic reliability of the
prediction. For example, many gene prediction programs
will report a measure of the reliability of prediction.
Thus, increasing degrees of such reliability can be
indicated, e.g., by increasing density of shading. Where
display 80 is used as a graphical user interface, such
measures of reliability, and indeed all other results
output by the program, can additionally or alternatively be
made accessible through linkage from individual rectangles
83, as by time-delayed window ("tool tip" window), or by
pointer (e.g., mouse)-activated link.

As earlier described, increased predictive
reliability can be achieved by requiring consensus among
methods and/or approaches to determining function. Thus,
field 81 can include a horizontal series of rectangles 83
that indicate one or more degrees of consensus in
predictions of function.
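
A consensus series of this kind could be derived, for example,
by finding the stretches of sequence predicted by at least a
chosen number of methods. The sweep over interval endpoints
below is one possible sketch; it assumes that the regions
reported by any single method do not overlap one another, and
its names are hypothetical.

```python
def consensus_regions(predictions, min_methods=2):
    """Return (start, end) intervals, 1-based and inclusive, that
    are covered by predictions from at least min_methods methods.

    `predictions` maps a method name to a list of (start, end)
    regions; regions within one method are assumed not to overlap.
    """
    events = []
    for regions in predictions.values():
        for start, end in regions:
            events.append((start, 1))       # coverage rises at start
            events.append((end + 1, -1))    # and falls just past end
    events.sort()
    result, depth, open_at = [], 0, None
    for pos, delta in events:
        previous = depth
        depth += delta
        if previous < min_methods <= depth:
            open_at = pos
        elif depth < min_methods <= previous:
            result.append((open_at, pos - 1))
    return result
```

Calling the function with successively higher values of
min_methods yields the "one or more degrees of consensus" that
can be displayed as additional rows.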

Although FIG. 3 shows three series of
horizontally disposed rectangles in field 81, display 80
can include as few as one such series of rectangles and as
many as can discriminably be displayed, depending upon the
number of methods and/or approaches used to predict a given
function.

Furthermore, field 81 can be used to show
predictions of a plurality of different functions.
However, the increased visual complexity occasioned by such
display makes more useful the ability of the user to select
a single function for display. When display 80 is used as
a graphical user interface for computer query and analysis,
such function can usefully be indicated and
user-selectable, as by a series of graphical buttons or tabs
(not shown in FIG. 3).

Rectangle 89 is shown in FIG. 3 as including
interposed rectangle 84. Rectangle 84 represents the
portion of annotated sequence for which predicted
functional information has been assayed physically, with
the starting and ending nucleotides of the assayed material
indicated by the X axis coordinates of the left and right
borders of rectangle 84. Rectangle 85, with optional
inclusive circles 86 (86a, 86b, and 86c) displays the
results of such physical assay.

Although a single rectangle 84 is shown in FIG.
3, physical assay is not limited to just one region of
annotated genomic sequence. It is expected that an
increasing percentage of regions predicted to have function
by process 200 will be assayed physically, and that display
80 will accordingly, for any given genomic sequence, have
an increasing number of rectangles 84 and 85, representing
an increased density of sequence annotation.

Where the function desired to be identified is 
protein coding, rectangle 84 identifies the sequence of the
probe used to measure expression. In embodiments of the
present invention where expression is measured using
genome-derived single exon microarrays, rectangle 84
identifies the sequence included within the probe
immobilized on the support surface of the microarray. As
noted supra, such probe will often include a small amount
of additional, synthetic, material incorporated during
amplification and designed to permit reamplification of the
probe, which sequence is typically not shown in display 80.

Rectangle 87 is used to present the results of
bioinformatic assay of the genomic sequence. For example,
where the function desired to be identified is protein
coding, process 400 can include bioinformatic query of
expression databases with the sequences predicted in
process 200 to encode exons. And as earlier discussed,
because bioinformatic assay presents fewer constraints than
does physical assay, often the entire output of process 200
can be used for such assay, without further subsetting
thereof by process 300. Therefore, rectangle 87 typically
need not have separate indicators therein of regions
submitted for bioinformatic assay; that is, rectangle 87
typically need not have regions therein analogous to
rectangles 84 within rectangle 89.

Rectangle 87 as shown in FIG. 3 includes smaller 
rectangles 880 and 88. Rectangles 880 indicate regions
that returned a positive result in the bioinformatic assay,
with rectangles 88 representing regions that did not return
such positive results. Where the function desired to be
predicted and displayed is protein coding, rectangles 880
indicate regions of the predicted exons that identify
sequence with significant similarity in expression
databases, such as EST, SNP, and SAGE databases, with
rectangles 88 indicating genes novel over those identified
in existing expression databases.

Rectangles 880 can further indicate, through
color, shading, texture, or the like, additional
information obtained from bioinformatic assay.

For example, where the function assayed and 
displayed is protein coding, the degree of shading of 
rectangles 880 can be used to represent the degree of
sequence similarity found upon query of expression
databases. The number of levels of discrimination can be
as few as two (identity, and similarity, where similarity
has a user-selectable lower threshold). Alternatively, as
many different levels of discrimination can be indicated as
can visually be discriminated.
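
By way of illustration only, the levels of discrimination
described above can be sketched in Python. The function name
shading_level, the default 70% threshold, and the equal-width
binning between the threshold and identity are assumptions
made for this sketch; the specification does not prescribe
any particular binning.

```python
def shading_level(percent_identity, threshold=70.0, levels=2):
    """Return a shading level for a database hit.

    0 means "below the user-selected threshold" (no shading),
    `levels` means identity (100%), and 1..levels-1 subdivide
    the similarity range [threshold, 100) into equal-width bins.
    All names and defaults here are hypothetical.
    """
    if levels < 2:
        raise ValueError("need at least two levels: identity and similarity")
    if not 0.0 <= percent_identity <= 100.0:
        raise ValueError("percent_identity must lie in [0, 100]")
    if percent_identity < threshold:
        return 0
    if percent_identity >= 100.0:
        return levels
    # Equal-width bins over [threshold, 100), numbered 1..levels-1.
    width = (100.0 - threshold) / (levels - 1)
    return 1 + min(int((percent_identity - threshold) / width), levels - 2)
```

With the default two levels, a hit is shaded either as
identical or as similar; raising `levels` gives a finer
gradation, limited in practice by what can visually be
discriminated.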

Where display 80 is used as a graphical user
interface, rectangles 880 can additionally provide links
directly to the sequences identified by the query of
expression databases, and/or statistical summaries thereof.
As with each of the precedingly-discussed uses of display
80 as a graphical user interface, it should be understood
that the information accessed via display 80 need not be
resident on the computer presenting such display, which
often will be serving as a client, with the linked
information resident on one or more remotely located
servers.

Rectangle 85 displays the results of physical 
assay of the sequence delimited by its left and right 
borders.

Rectangle 85 can consist of a single rectangle,
thus indicating a single assay, or alternatively, and
increasingly typically, will consist of a series of 
rectangles (85a, 85b, 85c) indicating separate physical 
assays of the same sequence. 

Where the function assayed is gene expression,
and where gene expression is assayed as herein described
using simultaneous two-color fluorescent detection of
hybridization to genome-derived single exon microarrays,
individual rectangles 85 can be colored to indicate the
degree of expression relative to control. Conveniently,
shades of green can be used to depict expression in the
sample over control values, and shades of red used to
depict expression less than control, corresponding to the
spectra of the Cy3 and Cy5 dyes conventionally used for
respective labeling thereof. Additional functional
information can be provided in the form of circles 86 (86a,
86b, 86c), where the diameter of the circle can be used to
indicate expression intensity. As discussed infra, such
relative expression (expression ratios) and absolute
expression (signal intensity) can be expressed using
normalized values.
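
The coloring convention just described can be sketched in
Python as follows. This is a minimal illustration, not the
program used to generate the figures: the function names, the
log2 scaling, the saturation point of an 8-fold change, and
the linear diameter scale are all assumptions made for the
sketch.

```python
import math


def ratio_to_rgb(sample, control, max_log2=3.0):
    """Color a rectangle 85 from a sample/control expression ratio.

    Green shades mark expression above control, red shades mark
    expression below control, and black marks no change. The color
    saturates at a 2**max_log2-fold change (hypothetical choice).
    """
    if sample <= 0 or control <= 0:
        raise ValueError("normalized signals must be positive")
    log_ratio = math.log2(sample / control)
    strength = min(abs(log_ratio) / max_log2, 1.0)
    level = round(255 * strength)
    if log_ratio > 0:
        return (0, level, 0)   # over control: green
    if log_ratio < 0:
        return (level, 0, 0)   # under control: red
    return (0, 0, 0)           # equal to control


def intensity_to_diameter(intensity, max_intensity, max_diameter=10.0):
    """Scale the diameter of a circle 86 linearly with signal intensity."""
    if max_intensity <= 0:
        raise ValueError("max_intensity must be positive")
    fraction = min(max(intensity / max_intensity, 0.0), 1.0)
    return max_diameter * fraction
```

A logarithmic scale is used here so that a two-fold increase
and a two-fold decrease receive shades of equal strength;
other scalings would serve equally well.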

Where display 80 is used as a graphical user 
interface, rectangle 85 can be used as a link to further 
information about the assay. For example, where the assay
is one for gene expression, each rectangle 85 can be used
to link to information about the source of the hybridized 
mRNA, the identity of the control, raw or processed data 
from the microarray scan, or the like. 

FIG. 4 is a rendition of display 80 representing
gene prediction and gene expression for a hypothetical BAC,
showing conventions used in the Examples presented infra.
BAC sequence ("Chip seq.") 89 is presented, with the
physically assayed region thereof (corresponding to
rectangle 84 in FIG. 3) shown in white. Algorithmic gene
predictions are shown in field 81, with predictions by
GRAIL, predictions by GENEFINDER, and predictions by
DICTION shown. Within rectangle 87, regions of sequence
that, when used to query expression databases, return
identical or similar sequences ("EST hit") are shown as
white rectangles (corresponding to rectangles 880 in FIG.
3), gray indicates low homology, and black indicates
unknowns (where black and gray would correspond to
rectangles 88 in FIG. 3).

Although FIGS. 3 and 4 show a single stretch of
sequence, uninterrupted from left to right, longer
sequences are usefully represented by vertical stacking of
such individual Mondrians, as shown in FIGS. 9 and 10. 

Single Exon Probes Useful For Measuring Gene Expression 

The methods and apparatus of the present 
invention rapidly produce functional information from 
genomic sequence. Where the function to be identified is 
protein coding, the methods and apparatus of the present 
invention rapidly identify and confirm the expression of
portions of genomic sequence that function to encode
protein. As a direct result, the methods and apparatus of
the present invention rapidly yield large numbers of
single-exon nucleic acid probes, the majority from
previously unknown genes, each of which is useful for
measuring and/or surveying expression of a specific gene in
one or more tissues or cell types. 

It is, therefore, another aspect of the present 
invention to provide genome-derived single exon nucleic
acid probes useful for gene expression analysis, and
particularly for gene expression analysis by microarray.

Using the methods and genome-derived single-exon 
microarrays of the present invention, we have for example 
readily identified a large number of unique ORFs from human
genomic sequence. Using single exon probes that encompass
these ORFs, we have demonstrated, through microarray 
hybridization analysis, the expression of 9,980 of these 
ORFs in heart. 

As would immediately be appreciated by one of
skill in the art, each single exon probe having
demonstrable expression in heart is currently available for
use in measuring the level of its ORF's expression in 
heart. 

Diseases of the heart and vascular system are a 
significant cause of human morbidity and mortality.
Increasingly, genetic factors are being found that
contribute to predisposition, onset, and/or aggressiveness
of most, if not all, of these diseases. Although mutations
in single genes have on occasion been identified as
causative, these disorders are for the most part believed
to have polygenic etiologies. 

For example, cardiovascular disease (CVD), which
includes coronary heart disease, stroke, and peripheral
arterial vascular disease, is the leading cause of death in
the United States and other developed countries. In
developing regions, coronary heart disease and stroke are
ranked second and third, respectively, as causes of
mortality. In the United States alone, about 1 million
deaths (about 42% of total deaths per year) result from CVD
each year. CVD is also a significant cause of morbidity,
with about 1.5 million people suffering myocardial
infarction, and about 500,000 suffering strokes in the
United States each year. With risk for CVD increasing with
age, and an increasingly aging population, CVD will
continue to be a major health problem into the future.

CVD is caused by arterial lesions that begin as 
fatty streaks, which consist of lipid-laden foam cells, and 
develop into fibrous plaques. The atherosclerotic plaque 
may grow slowly, and over several decades may produce a
severe stenosis or result in arterial occlusion. Some
plaques are stable, but other, more unstable, ones may 
rupture and induce thrombosis. The thrombi may embolize, 
rapidly occluding the lumen and leading to myocardial 
infarction or acute ischemic syndrome. 

Risk factors for CVD include age and gender. In
addition, a family history of CVD significantly increases
risk, indicating a genetic basis for development of this
disease complex. Obesity, especially truncal obesity, the
cause of which is suspected to be genetic, is yet another
risk factor for CVD. Familial disorders such as
hyperlipidemia, hypoalphalipoproteinemia,
hypertriglyceridemia, hypercholesterolemia,
hyperinsulinemia, homocystinuria, and
dysbetalipoproteinemia, all of which lead to lipid or
lipoprotein abnormalities, can predispose one to the
development of CVD. Both insulin-dependent and
non-insulin-dependent diabetes mellitus, both of which have
genetic components, have also been linked to the
development of atherosclerosis.

The literature is replete with evidence for
genetic causes of cardiovascular diseases. For example,
studies by Allayee et al., Am. J. Hum. Genet. 63: 577-585
(1998), indicated a genetic association between familial
combined hyperlipidemia (FCHL) and small dense LDL
particles. The studies also concluded that the genetic
determinants for LDL particle size are shared, at least in
part, among FCHL families and the more general population
at risk for CVD. Juo et al., Am. J. Hum. Genet. 63: 586-594
(1998) demonstrated that small, dense LDL particles and
elevated apolipoprotein B levels, both of which are
commonly found in members of FCHL families, share a common
major gene plus individual polygenic components.

The common major gene was estimated to explain 37% of
the variance of adjusted LDL particle size and 23% of the
variance of adjusted apoB levels.

The atherogenic lipoprotein phenotype (ALP) is a
common heritable trait, symptoms of which include a
prevalence of small, dense LDL particles, increased levels
of triglyceride-rich lipoproteins, reduced levels of high
density lipoprotein, and increased risk of CVD,
particularly myocardial infarction. Both Nishina et al.,
Proc. Natl. Acad. Sci. USA 89: 708-712 (1992) and Rotter et
al., Am. J. Hum. Genet. 58: 585-594 (1996) demonstrated
linkage between ALP and the LDLR locus. Rotter et al.,
supra, also reported linkage to the CETP locus on
chromosome 16 and to the SOD2 locus on chromosome 6, and
possibly also to the APOA1/APOC3/APOA4 cluster on
chromosome 11.

Mutations in genes identified as components of
lipid metabolism, e.g., apolipoprotein E (apoE) and LDL
receptor (LDLR), have been shown to be associated with
predisposition to the development of CVD. For example,
several apoE variants have been found to be associated with
familial dysbetalipoproteinemia, characterized by elevated
plasma cholesterol and triglyceride levels and an increased
risk for atherosclerosis (de Knijff et al., Hum. Mutat. 4:
178-194 (1994)). Mutations in the LDLR gene have been
associated with familial hypercholesterolemia, an
autosomal dominant disorder characterized by elevation of
serum cholesterol bound to low density lipoprotein (LDL),
that can lead to increased susceptibility to CVD.

To date, mutations in numerous genes have been 
shown to be associated with increased CVD susceptibility. 
However, the identified genetic associations are believed 
not to account for all genetic contributions to CVD. 

As yet another example, hypertension is a major
health problem because of its high prevalence and its
association with increased risk of CVD. Approximately 25% 
of all adults and over 60% of persons older than 60 years 
in the United States have high blood pressure. 

Arterial or systemic hypertension is diagnosed
when the average of two or more diastolic BP measurements
on at least two subsequent visits is 90 mm Hg or more, or
when the average of multiple systolic BP readings on two or
more subsequent visits is consistently greater than 140 mm
Hg. Pulmonary hypertension is defined as pressure within
the pulmonary arterial system elevated above the normal 
range; pulmonary hypertension may lead to right ventricle 
(RV) failure. 

Hypertension, together with other cardiovascular
risk factors, leads to atherosclerosis and other forms of
CVD, primarily by damaging the vascular endothelium. In
more than 40% of the U.S. population, hypertension is
accompanied by hyperlipidemia and leads to the development
of atherosclerotic plaques. In the absence of
hyperlipidemia, intimal thickening occurs.
Non-atherosclerotic hypertension-induced vascular damage can
lead to stroke or heart failure. 

Familial diseases associated with secondary 
hypertension include familial renal disease, polycystic
kidney disease, medullary thyroid cancer, pheochromocytoma,
and hyperparathyroidism. Hypertension is also twice as
common in patients with diabetes mellitus. 

More than 95% of all hypertension cases are
essential hypertension, that is, they lack an identifiable
antecedent clinical cause. Essential hypertension shows
clustering in families and can result from a variety of
genetic diseases. In most cases, high blood pressure
results from a complex interaction of factors with both
genetic and environmental components. The recent search
for genes that contribute to the development of essential
hypertension has shown that the disorder is polygenic in
origin. However, with several exceptions (such as
angiotensinogen, angiotensin receptor-1, beta-3 subunit of
guanine nucleotide-binding protein, tumor necrosis factor
receptor-2, and alpha-adducin), the particular genes involved
are still being sought. 

Susceptibility loci for essential hypertension 
have been mapped to chromosomes 17 and 15q. Hasstedt et
al., Am. J. Hum. Genet. 43: 14-22 (1988) measured red cell
sodium in 1,800 normotensive members of 16 Utah pedigrees
ascertained through hypertensive or normotensive probands,
siblings with early stroke death, or brothers with early
coronary disease, and suggested that red blood cell sodium
was determined by 4 alleles at a single locus. This major
locus was thought to explain 29% of the variance in red
cell sodium, and polygenic inheritance explained another
54.6%. A higher frequency of the high red blood cell sodium
genotype in pedigrees in which the proband was hypertensive
rather than normotensive provided evidence that this major
locus increases susceptibility to hypertension.

From a study of systolic blood pressure in 278 
pedigrees, Perusse et al., Am. J. Hum. Genet. 49: 94-105 
(1991) reported that variability in systolic blood pressure 
is likely influenced by allelic variation of a single gene,
with gender and age dependence. They also suggested that a
single gene may be associated with a steeper increase of
blood pressure with age among males and females. 

There is strong evidence, however, for additional,
as yet uncharacterized, hypertension-associated loci on
other chromosomes.

For example, Xu et al., Am. J. Hum. Genet. 64:
1694-1701 (1999) carried out a systematic search for
chromosomal regions containing genes that regulate blood
pressure by scanning the entire autosomal genome using 367
polymorphic markers. Because of the sampling design, the
number of sib pairs, and the availability of genotyped
parents, this study represented one of the most powerful of
its kind. Although no regions achieved a 5% genomewide
significance level, maximum lod scores were greater than
2.0 for regions of chromosomes 3, 11, 15, 16, and 17.

As another example, cardiac arrhythmias account 
for several thousand deaths each year. Arrhythmias such as 
ventricular fibrillation, which causes more than 300,000 
sudden deaths annually in the United States alone,
encompass a multitude of disorders. Another type of
arrhythmia, idiopathic dilated cardiomyopathy, of which
familial dilated cardiomyopathy accounts for 20-25%, is
responsible for more than 10,000 deaths in the United
States annually and is the predominant indication for
cardiac transplantation.

Cardiac arrhythmias can be divided into 
bradyarrhythmias (slowed rhythms) or tachyarrhythmias 
(speeded rhythms). Bradyarrhythmias result from
abnormalities of intrinsic automatic behavior or
conduction, primarily within the atrioventricular node and
the His-Purkinje network. Tachyarrhythmias are caused by
altered automaticity, reentry, or triggered automaticity.

Bradyarrhythmias arising from suspected polygenic 
disorders include Long QT syndrome 4, atrioventricular
block, familial sinus node disease, progressive cardiac
conduction defect, and familial cardiomyopathy.
Tachyarrhythmias with possible underlying polygenic causes
include familial ventricular tachycardia,
Wolff-Parkinson-White syndrome, familial arrhythmogenic
right ventricular dysplasia, heart-hand syndrome V, Mal de
Meleda, familial
ventricular fibrillation, and familial noncompaction of 
left ventricular myocardium. 

For some of the arrhythmias, one or more of the 
causative genes have been identified. 

For example, atrioventricular block has been
associated with mutations in the SCN5A gene, as well as
mutations in a locus mapped to 19q13. Studies have shown
linkage of familial sinus node disease to a marker on
10q22-q24. Familial ventricular tachycardia has been
linked to mutations in genes encoding the G protein subunit
alpha-i2 (GNAI2), and/or related genes. Examination of
families with Wolff-Parkinson-White syndrome suggests an
autosomal dominant pattern of inheritance and evidence of
linkage of the disorder to DNA markers on band 7q3.
Linkage analysis shows strong evidence for localization of
a gene for Mal de Meleda disease on 8qter. Familial
ventricular fibrillation can be caused by mutations in the
cardiac sodium channel gene SCN5A. Familial noncompaction
of left ventricular myocardium has been linked to mutations
in the gene encoding tafazzin (TAZ), or in the
FK506-binding protein 1A gene (FKBP1A).

Familial dilated cardiomyopathy is characterized 
by an autosomal dominant pattern of inheritance with
age-related penetrance. The linkage of familial dilated
cardiomyopathy to several loci indicates that it is
polygenic. These loci include CMD1A on 1p11-q11, CMD1B on
9q13, CMD1C on 10q21, CMD1D on 1q32, CMD1E on 3p, CMD1F on
6q, CMD1G on 2q31, CMD1H on 2q14-q22, and CMD1I, which
results from mutation in the DES gene on 2q35.

In addition, cardiomyopathy can also be caused by
mutations in the ACTC gene, the cardiac beta-myosin heavy
chain gene (MYH7), or the cardiac troponin T gene. 

Familial arrhythmogenic right ventricular 
dysplasia is inherited as an autosomal dominant with 
reduced penetrance and is one of the major genetic causes
of juvenile sudden death. It is estimated that the
prevalence of familial arrhythmogenic right ventricular
dysplasia ranges from 6 per 10,000 in the general
population to 4.4 per 1,000 in some areas.

Several loci for familial arrhythmogenic right
ventricular dysplasia have been mapped, indicating that this
disease is also polygenic in nature. These loci include
ARVD1 on 14q23-q24, ARVD2 on 1q42-q43, ARVD3 on 14q12-q22,
ARVD4 on 2q32.1-q32.3, ARVD5 on 3p23, and ARVD6 on
10p14-p12.

Progressive cardiac conduction defect (PCCD),
also called Lenegre-Lev disease, is one of the most common
cardiac conduction diseases. It is characterized by
progressive alteration of cardiac conduction through the
His-Purkinje system with right or left bundle branch block
and widening of QRS complexes, leading to complete
atrioventricular block and ultimately causing syncope and
sudden death. It represents the major cause of pacemaker
implantation in the world (0.15 implantations per 1,000
inhabitants per year in developed countries). The cause of
PCCD is unknown, but familial cases with right bundle branch
block have been reported, suggesting that at least some
cases are of genetic origin. Reports have linked PCCD to
HB1 on 19q13.3, and to mutations in the SCN5A gene (Schott
et al., Nature Genet. 23: 20-21 (1999)).

As yet a further example, congenital heart 
disease occurs at a rate of 8 per 1000 live births, which 
corresponds to approximately 32,000 infants with newly 
diagnosed congenital heart disease each year in the United
States. Twenty percent of infants with congenital heart
disease die within the first year of life. Approximately
80% of the first-year survivors live to reach adulthood.
Congenital heart disease also has economic impact due to
the estimated 20,000 surgical procedures performed to
correct circulatory defects in these patients. The
estimated number of adults with congenital heart disease in
the United States is currently about 900,000. 

In 90% of patients, congenital heart disease is 
attributable to multifactorial inheritance. Only 5-10% of
malformations are due to primary genetic factors, which are
either chromosomal or a result of a single mutant gene. 

The most common congenital heart disease found in 
adults is bicuspid aortic valve. This defect occurs in 2% 
of the general population and accounts for approximately
50% of operated cases of aortic stenosis in adults. Atrial
septal defect is responsible for 30-40% of congenital heart
disease seen in adults. The most common congenital cardiac
defect observed in the pediatric population is ventricular
septal defect, which accounts for 15-20% of all congenital
lesions. Tetralogy of Fallot is the most common cyanotic
congenital anomaly observed in adults. Other congenital
heart diseases include Eisenmenger's syndrome, patent
ductus arteriosus, pulmonary stenosis, coarctation of the
aorta, transposition of the great arteries, tricuspid
atresia, univentricular heart, Ebstein's anomaly, and
double-outlet right ventricle. 

A number of studies have identified putative 
genetic loci associated with one or more congenital heart 
diseases.

Congenital heart disease affects more than 40% of
all Down syndrome patients. The candidate chromosomal
region containing the putative gene or genes for congenital 
heart disease associated with Down syndrome is 21q22.2- 
q22.3, between ETS2 and MX1. 

DiGeorge syndrome (DGS) is characterized by
several symptoms including outflow tract defects of the
heart such as Tetralogy of Fallot. Most cases result from
a deletion of chromosome 22q11.2 (the DiGeorge syndrome
chromosome region, or DGCR). The 22q11 deletion is the
second most common cause of congenital heart disease after
Down syndrome. Several genes are lost in this deletion
including the putative transcription factor TUPLE1. This
deletion is associated with a variety of phenotypes, e.g.,
Shprintzen syndrome; conotruncal anomaly face (or Takao
syndrome); and isolated outflow tract defects of the heart
including Tetralogy of Fallot, truncus arteriosus, and
interrupted aortic arch.

Whereas 90% of cases of DGS may now be attributed
to a 22q11 deletion, other associated chromosome defects
have been identified. For example, Greenberg et al., Am.
J. Hum. Genet. 43: 605-611 (1988), reported one case of DGS
with del10p13 and one with an 18q21.33 deletion. Fukushima
et al., Am. J. Hum. Genet. 51 (suppl.): A80 (1992) reported
linkage with a deletion of 4q21.3-q25. Gottlieb et al.,
Am. J. Hum. Genet. 62: 495-498 (1998) concluded that the
deletion of more than 1 region on 10p could be associated
with the DGS phenotype. The association of the DiGeorge
syndrome with at least 2 and possibly more chromosomal
locations suggests strongly the involvement of several
genes in this disease.

Digilio et al., J. Med. Genet. 34: 188-190
(1997), calculated empiric risk figures for recurrence of
isolated Tetralogy of Fallot in families after exclusion of
del(22q11), and concluded that gene(s) different from those
located on 22q11 must be involved in causing familial
aggregation of nonsyndromic Tetralogy of Fallot. Johnson
et al., Am. J. Med. Genet. (1997) conducted a cytogenetic
evaluation of 159 cases of Tetralogy of Fallot. They
reported that a del(22q11) was identified in 14% of those
who underwent fluorescence in situ hybridization (FISH)
testing with the N25 cosmid probe.

Other congenital heart diseases are also suspected
to be of polygenic origin. For example, Holmes et al.,
Birth Defects Orig. Art. Ser. X(4): 228-230 (1974)
described familial clustering of hypoplastic left heart
syndrome in siblings consistent with multifactorial
causation.

Other significant diseases of the heart and
vascular system are also believed to have a genetic,
typically polygenic, etiological component. These diseases
include, for example, hypoplastic left heart syndrome,
cardiac valvular dysplasia, Pfeiffer cardiocranial
syndrome, oculofaciocardiodental syndrome, Kapur-Toriello
syndrome, Sonoda syndrome, Ohdo Blepharophimosis syndrome,
heart-hand syndrome, Pierre-Robin syndrome, Hirschsprung
disease, Kousseff syndrome, Grange occlusive arterial
syndrome, Kearns-Sayre syndrome, Kartagener syndrome,
Alagille syndrome, Ritscher-Schinzel syndrome, Ivemark
syndrome, Young-Simpson syndrome, hemochromatosis,
Holzgreve syndrome, Barth syndrome, Smith-Lemli-Opitz
syndrome, glycogen storage disease, Gaucher-like disease,
Fabry disease, Lowry-Maclean syndrome, Rett syndrome, Opitz
syndrome, Marfan syndrome, Miller-Dieker lissencephaly
syndrome, mucopolysaccharidosis, Brugada syndrome,
humerospinal dysostosis, Phaver syndrome, McDonough
syndrome, Marfanoid hypermobility syndrome,
atransferrinemia, Cornelia de Lange syndrome, Leopard
syndrome, Diamond-Blackfan anemia, Steinfeld syndrome,
progeria, and Williams-Beuren syndrome.

The human genome-derived single exon nucleic acid
probes and microarrays of the present invention are useful
for predicting, diagnosing, grading, staging, monitoring
and prognosing diseases of the human heart and vascular
system, particularly those diseases with polygenic
etiology. With each of the single exon probes described
herein shown to be expressed at detectable levels in human
heart, and with about 2/3 of the probes identifying novel
genes, the single exon microarrays of the present invention
provide exceptionally high informational content for such
studies.

For example, diagnosis (including differential
diagnosis among clinically indistinguishable disorders),
staging, and/or grading of a disease can be based upon the
quantitative relatedness of a patient gene expression
profile to one or more reference expression profiles known
to be characteristic of a given heart or vascular disease,
or to specific grades or stages thereof.

In one embodiment, the patient gene expression 
profile is generated by hybridizing nucleic acids obtained
directly or indirectly from transcripts expressed in the
patient's heart or vascular tissues to the genome-derived
single exon microarray of the present invention. Reference
profiles are obtained similarly by hybridizing nucleic
acids obtained directly or indirectly from transcripts
expressed in heart or vascular tissue of individuals with
known disease. Methods for quantitatively relating gene
expression profiles, without regard to the function of the 
protein encoded by the gene, are disclosed in WO 99/58720, 
incorporated herein by reference in its entirety. 
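
The idea of relating a patient profile to reference profiles
can be illustrated with a short Python sketch. It does not
reproduce the methods of WO 99/58720; Pearson correlation is
substituted here purely as a familiar stand-in measure of
relatedness, and the function names, profile labels, and
expression values are hypothetical.

```python
import math


def pearson(x, y):
    """Pearson correlation between two equal-length expression profiles."""
    if len(x) != len(y) or len(x) < 2:
        raise ValueError("profiles must have the same length (>= 2 probes)")
    mean_x = sum(x) / len(x)
    mean_y = sum(y) / len(y)
    cov = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    var_x = sum((a - mean_x) ** 2 for a in x)
    var_y = sum((b - mean_y) ** 2 for b in y)
    if var_x == 0 or var_y == 0:
        raise ValueError("a constant profile has no defined correlation")
    return cov / math.sqrt(var_x * var_y)


def closest_reference(patient, references):
    """Return the label of the reference profile most correlated with the patient's."""
    return max(references, key=lambda label: pearson(patient, references[label]))


# Hypothetical normalized expression values, one entry per probe.
patient_profile = [1.0, 2.0, 3.0, 4.0]
reference_profiles = {
    "reference_A": [2.0, 4.0, 6.0, 8.0],
    "reference_B": [4.0, 3.0, 2.0, 1.0],
}
best_match = closest_reference(patient_profile, reference_profiles)
```

Because the comparison uses only the numerical profiles, it
proceeds without regard to the function of the protein
encoded by any probe, in keeping with the passage above.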

In another approach, the genome-derived single
exon probes and microarrays of the present invention can be
used to interrogate genomic DNA, rather than pools of
expressed message; this latter approach permits
predisposition to and/or prognosis of heart or vascular
disease to be assessed through the massively parallel
determination of altered copy number, deletion, or mutation
in the patient's genome of exons known to be expressed in 
human heart. The algorithms set forth in WO 99/58720 can 
be applied to such genomic profiles without regard to the 
function of the protein encoded by the interrogated gene. 

The utility is specific to the probe; at
sufficiently high hybridization stringency (such
stringencies are well known in the art; see Ausubel et al.
and Maniatis et al.), each probe reports the level of
expression of message specifically containing that ORF.

It should be appreciated, however, that the
probes of the present invention, for which expression in
the heart has been demonstrated, are useful both for
measurement in the heart and for survey of expression in
other tissues.

Significant among such advantages is the presence
of probes for novel genes.

As mentioned above and further detailed in
Examples 1 and 2, the methods described enable ORFs which
are not present in existing expression databases to be
identified. And the fewer the number of tissues in which
the ORF can be shown to be expressed, the more likely the
ORF will prove to be part of a novel gene: as further
discussed in Example 2, ORFs whose expression was
measurable in only a single one of the tested tissues were
represented in existing expression databases at a rate of
only 11%, whereas 36% of ORFs whose expression was
measurable in 9 tissues were present in existing expression
databases, and fully 45% of those ORFs expressed in all ten
tested tissues were present in existing expressed sequence
databases.

Either as tools for measuring gene expression or tools for
surveying gene expression, the genome-derived single exon
probes of the present invention have significant advantages
over the cDNA or EST-based probes that are currently
available for achieving these utilities.

The genome-derived single exon probes of the present
invention are useful in constructing genome-derived single
exon microarrays; the genome-derived single exon
microarrays, in turn, are useful devices for measuring and
for surveying gene expression in the human.

Gene expression analysis using microarrays, conventionally
microarrays having probes derived from expressed message, is
well established as useful in the biological research arts
(see Lockhart et al., Nature 405, 827-836).

Microarrays have been used to determine gene expression
profiles in cells in response to drug treatment (see, for
example, Kaminski et al., "Global Analysis of Gene
Expression in Pulmonary Fibrosis Reveals Distinct Programs
Regulating Lung Inflammation and Fibrosis," Proc. Natl.
Acad. Sci. USA 97(4):1778-83 (2000); Bartosiewicz et al.,
"Development of a Toxicological Gene Array and Quantitative
Assessment of This Technology," Arch. Biochem. Biophys.
376(1):66-73 (2000)), viral infection (see, for example,
Geiss et al., "Large-scale Monitoring of Host Cell Gene
Expression During HIV-1 Infection Using cDNA Microarrays,"
Virology 266(1):8-16 (2000)) and during cell processes such
as differentiation, senescence and apoptosis (see, for
example, Shelton et al., "Microarray Analysis of Replicative
Senescence," Curr. Biol. 9(17):939-45 (1999); Voehringer et
al., "Gene Microarray Identification of Redox and
Mitochondrial Elements That Control Resistance or
Sensitivity to Apoptosis," Proc. Natl. Acad. Sci. USA
97(6):2680-5 (2000)).

Microarrays have also been used to determine abnormal gene
expression in diseased tissues (see, for example, Alon et
al., "Broad Patterns of Gene Expression Revealed by
Clustering Analysis of Tumor and Normal Colon Tissues Probed
by Oligonucleotide Arrays," Proc. Natl. Acad. Sci. USA
96(12):6745-50 (1999); Perou et al., "Distinctive Gene
Expression Patterns in Human Mammary Epithelial Cells and
Breast Cancers," Proc. Natl. Acad. Sci. USA 96(16):9212-7
(1999); Wang et al., "Identification of Genes Differentially
Over-expressed in Lung Squamous Cell Carcinoma Using
Combination of cDNA Subtraction and Microarray Analysis,"
Oncogene 19(12):1519-28 (2000); Whitney et al., "Analysis of
Gene Expression in Multiple Sclerosis Lesions Using cDNA
Microarrays," Ann. Neurol. 46(3):425-8 (1999)), in drug
discovery screens (see, for example, Scherf et al., "A Gene
Expression Database for the Molecular Pharmacology of
Cancer," Nat. Genet. 24(3):236-44 (2000)) and in diagnosis
to determine appropriate treatment strategies (see, for
example, Sgroi et al., "In vivo Gene Expression Profile
Analysis of Human Breast Cancer Progression," Cancer Res.
59(22):5656-61 (1999)).

In microarray-based gene expression screens of
pharmacological drug candidates upon cells, each probe
provides specific useful data. In particular, it should be
appreciated that even those probes that show no change in
expression are as informative as those that do change,
serving, in essence, as negative controls.

For example, where gene expression analysis is used to
assess toxicity of chemical agents on cells, the failure of
the agent to change a gene's expression level is evidence
that the drug likely does not affect the pathway of which
the gene's expressed protein is a part. Analogously, where
gene expression analysis is used to assess side effects of
pharmacological agents, whether in lead compound discovery
or in subsequent screening of lead compound derivatives, the
inability of the agent to alter a gene's expression level is
evidence that the drug does not affect the pathway of which
the gene's expressed protein is a part.

WO 99/58720 provides methods for quantifying the relatedness
of a first and second gene expression profile and for
ordering the relatedness of a plurality of gene expression
profiles. The methods so described permit useful
information to be extracted from a greater percentage of the
individual gene expression measurements from a microarray
than methods previously used in the art.

Other uses of microarrays are described in Gerhold et al.,
Trends Biochem. Sci. 24(5):168-173 (1999) and Zweiger,
Trends Biotechnol. 17(11):429-436 (1999); Schena et al.

The invention particularly provides genome-derived
single-exon probes known to be expressed in heart.

The individual single exon probes can be provided in the
form of substantially isolated and purified nucleic acid,
typically, but not necessarily, in a quantity sufficient to
perform a hybridization reaction.

Such nucleic acid can be in any form directly hybridizable
to the message that contains the probe's ORF, such as double
stranded DNA, single-stranded DNA complementary to the
message, single-stranded RNA complementary to the message,
or chimeric DNA/RNA molecules so hybridizable. The nucleic
acid can alternatively or additionally include either
nonnative nucleotides, alternative internucleotide linkages,
or both, so long as complementary binding can be obtained.
For example, probes can include phosphorothioates,
methylphosphonates, morpholino analogs, and peptide nucleic
acids (PNA), as are described, for example, in U.S. Patent
Nos. 5,142,047; 5,235,033; 5,166,315; 5,217,866; 5,184,444;
5,861,250.

Usefully, however, such probes are provided in a form and
quantity suitable for amplification, where the amplified
product is thereafter to be used in the hybridization
reactions that probe gene expression. Typically, such
probes are provided in a form and quantity suitable for
amplification by PCR or by another well-known amplification
technique. One such technique additional to PCR is rolling
circle amplification, as is described, inter alia, in U.S.
Patent Nos. 5,854,033 and 5,714,320 and international patent
publications WO 97/19193 and WO 00/15779. As is well
understood, where the probes are to be provided in a form
suitable for amplification, the range of nucleic acid
analogues and/or internucleotide linkages will be
constrained by the requirements and nature of the
amplification enzyme.

Where the probe is to be provided in a form suitable for
amplification, the quantity need not be sufficient for
direct hybridization for gene expression analysis, and need
be sufficient only to function as an amplification
template, typically at least about 1, 10 or 100 pg or more.

Each discrete amplifiable probe can also be packaged with
amplification primers, either in a single composition that
comprises probe template and primers, or in a kit that
comprises such primers separately packaged therefrom. As
earlier mentioned, the ORF-specific 5' primers used for
genomic amplification can have a first common sequence added
thereto, and the ORF-specific 3' primers used for genomic
amplification can have a second, different, common sequence
added thereto, thus permitting, in this embodiment, the use
of a single set of 5' and 3' primers to amplify any one of
the probes. The probe composition and/or kit can also
include buffers, enzyme, etc., required to effect
amplification.

As mentioned earlier, when intended for use on a
genome-derived single exon microarray of the present
invention, the genome-derived single exon probes of the
present invention will typically average at least about
100, 200, 300, 400 or 500 bp in length, including (and
typically, but not necessarily, centered about) the ORF.
Furthermore, when intended for use on a genome-derived
single exon microarray of the present invention, the
genome-derived single exon probes of the present invention
will typically not contain a detectable label.

When intended for use in solution phase hybridization,
however (that is, for use in a hybridization reaction in
which the probe is not first bound to a support substrate,
although the target may indeed be so bound), length
constraints that are imposed in microarray-based
hybridization approaches will be relaxed, and such probes
will typically be labeled.

In such case, the only functional constraint that dictates
the minimum size of such probe is that each such probe must
be capable of specifically identifying in a hybridization
reaction the exon from which it is drawn. In theory, a
probe of as little as 17 nucleotides is capable of uniquely
identifying its cognate sequence in the human genome. For
hybridization to expressed message, a subset of target
sequence that is much reduced in complexity as compared to
genomic sequence, even fewer nucleotides are required for
specificity.
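The 17-nucleotide figure can be checked with a back-of-the-envelope calculation. The sketch below assumes a random-sequence model with equal base frequencies and a haploid genome of roughly 3 x 10^9 bp searched on both strands; those values, the 10^8 example for a less complex target, and the function names are illustrative assumptions rather than figures taken from this description.

```python
def expected_chance_matches(probe_len: int, target_positions: float) -> float:
    """Expected chance occurrences of one specific probe_len-mer among
    target_positions random positions (equal base frequencies assumed)."""
    return target_positions / (4 ** probe_len)


def shortest_unique_length(target_positions: float) -> int:
    """Smallest length whose expected number of chance matches is below 1."""
    length = 1
    while expected_chance_matches(length, target_positions) >= 1.0:
        length += 1
    return length


# Assumption: ~3e9 bp haploid genome, both strands searched.
GENOME_POSITIONS = 2 * 3.0e9

if __name__ == "__main__":
    print(shortest_unique_length(GENOME_POSITIONS))  # genomic target
    print(shortest_unique_length(1.0e8))             # hypothetical smaller target
```

Under this simple model a target of lower complexity than the genome yields a shorter minimum length, which is the point made above for expressed message.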

Therefore, the probes of the present invention can include
as few as 20, 25 or 50 bp of ORF, or more. In particular
embodiments, the ORF sequences are given in SEQ ID NOS.
9,981 - 19,771, respectively, for probe SEQ ID NOS. 1 -
9,980. The minimum amount of ORF required to be included in
the probe of the present invention in order to provide
specific signal in either solution phase or microarray-based
hybridizations can readily be determined for each of ORF SEQ
ID NOS. 9,981 - 19,771 individually by routine
experimentation using standard high stringency conditions.

Such high stringency conditions are described, inter alia,
in Ausubel et al. and Maniatis et al. For microarray-based
hybridization, standard high stringency conditions can
usefully be 50% formamide, 5X SSC, 0.2 ug/ul poly(dA), 0.2
ug/ul human Cot-1 DNA, and 0.5% SDS, in a humid oven at 42°C
overnight, followed by successive washes of the microarray
in 1X SSC, 0.2% SDS at 55°C for 5 minutes, and then 0.1X
SSC, 0.2% SDS, at 55°C for 20 minutes. For solution phase
hybridization, standard high stringency conditions can
usefully be aqueous hybridization at 65°C in 6X SSC. Lower
stringency conditions, suitable for cross-hybridization to
mRNA encoding structurally- and functionally-related
proteins, can usefully be the same as the high stringency
conditions but with reduction in temperature for
hybridization and washing to room temperature
(approximately 25°C).

When intended for use in solution phase hybridization, the
maximum size of the single exon probes of the present
invention is dictated by the proximity of other expressed
exons in genomic DNA: although each single exon probe can
include intergenic and/or intronic material contiguous to
the ORF in the human genome, each probe of the present
invention will include portions of only one expressed exon.

Thus, each single exon probe will include no more than
about 25 kb of contiguous genomic sequence, more typically
no more than about 20 kb of contiguous genomic sequence,
more usually no more than about 15 kb, even more usually no
more than about 10 kb. Usually, probes that are maximally
about 5 kb will be used, more typically no more than about
3 kb.

It will be appreciated that the Sequence Listing appended
hereto presents, by convention, only that strand of the
probe and ORF sequence that can be directly translated
reading from 5' to 3' end. As would be well understood by
one of skill in the art, single stranded probes must be
complementary in sequence to the ORF as present in an mRNA;
it is well within the skill in the art to determine such
complementary sequence. It will further be understood that
double stranded probes can be used in both solution-phase
hybridization and microarray-based hybridization if suitably
denatured.
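Deriving the strand complementary to a listed 5'-to-3' coding sequence is a mechanical operation. A minimal sketch follows; the function name and the example sequence are illustrative and are not drawn from the Sequence Listing.

```python
# Translation table for Watson-Crick complements of DNA bases.
_COMPLEMENT = str.maketrans("ACGTacgt", "TGCAtgca")


def reverse_complement(seq: str) -> str:
    """Return the complementary strand of a DNA sequence, written 5' to 3'."""
    return seq.translate(_COMPLEMENT)[::-1]


if __name__ == "__main__":
    coding_strand = "ATGGCC"  # illustrative sequence only
    print(reverse_complement(coding_strand))
```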

Thus, it is an aspect of the present invention to provide
single-stranded nucleic acid probes that have sequence
complementary to those described herein above and below,
and double-stranded probes one strand of which has sequence
complementary to the probes described herein.

The probes can, but need not, contain intergenic and/or
intronic material that flanks the ORF, on one or both sides,
in the same linear relationship to the ORF that the
intergenic and/or intronic material bears to the ORF in
genomic DNA. The probes do not, however, contain nucleic
acid derived from more than one expressed ORF.

And when intended for use in solution hybridization, the
probes of the present invention can usefully have detectable
labels. Nucleic acid labels are well known in the art, and
include, inter alia, radioactive labels, such as 3H, 32P,
33P, 35S, 125I, 131I; fluorescent labels, such as Cy3, Cy5,
Cy5.5, Cy7, SYBR® Green and other labels described in
Haugland, Handbook of Fluorescent Probes and Research
Chemicals, 7th ed., Molecular Probes Inc., Eugene, OR
(2000), or fluorescence resonance energy transfer tandem
conjugates thereof; labels suitable for chemiluminescent
and/or enhanced chemiluminescent detection; labels suitable
for ESR and NMR detection; and labels that include one
member of a specific binding pair, such as biotin,
digoxigenin, or the like.

The probes, either in quantity sufficient for hybridization
or sufficient for amplification, can be provided in
individual vials or containers.

Alternatively, such probes can usefully be packaged as a
plurality of such individual genome-derived single exon
probes.

When provided as a collection of plural individual probes,
the probes are typically made available in amplifiable form
in a spatially-addressable ordered set, typically one per
well of a microtiter dish. Although a 96 well microtiter
plate can be used, greater efficiency is obtained using
higher density arrays.

If, as earlier mentioned, the ORF-specific 5' primers used
for genomic amplification had a first common sequence added
thereto, and the ORF-specific 3' primers used for genomic
amplification had a second, different, common sequence added
thereto, a single set of 5' and 3' primers can be used to
amplify all of the probes from the amplifiable ordered set.

Such collections of genome-derived single exon probes can
usefully include a plurality of probes chosen for the common
attribute of expression in the human heart.

In such defined subsets, typically at least 50, 60, 75, 80,
85, 90 or 95% or more of the probes will be chosen by their
expression in the defined tissue or cell type.

The single exon probes of the present invention, as well as
fragments of the single exon probes comprising selectively
hybridizable portions of the probe ORF, can be used to
obtain the full length cDNA that includes the ORF by (i)
screening of cDNA libraries; (ii) rapid amplification of
cDNA ends ("RACE"); or (iii) other conventional means, as
are described, inter alia, in Ausubel et al. and Maniatis et
al.

It is another aspect of the present invention to provide
genome-derived single exon nucleic acid microarrays useful
for gene expression analysis, where the term "microarray"
has the meaning given in the definitional section of this
description, supra.

The invention particularly provides genome-derived
single-exon nucleic acid microarrays comprising a plurality
of probes known to be expressed in human heart. In
preferred embodiments, the present invention provides human
genome-derived single exon microarrays comprising a
plurality of probes drawn from the group consisting of SEQ
ID NOS.: 1 - 9,980.

When used for gene expression analysis, the genome-derived
single exon microarrays provide greater physical
informational density than do the genome-derived single
exon microarrays that have lower percentages of probes
known to be expressed commonly in the tested tissue. At a
fixed probe density, for example, a given microarray
surface area of the defined subset genome-derived single
exon microarray can yield a greater number of expression
measurements. Alternatively, at a given probe density, the
same number of expression measurements can be obtained from
a smaller substrate surface area. Alternatively, at a fixed
probe density and fixed surface area, probes can be provided
redundantly, providing greater reliability in signal
measurement for any given probe. Furthermore, with a higher
percentage of probes known to be expressed in the assayed
tissue, the dynamic range of the detection means can be
adjusted to reveal finer levels of discrimination among the
levels of expression.

Although particularly described with respect to their
utility as probes of gene expression, particularly as probes
to be included on a genome-derived single exon microarray,
each of the nucleic acids having SEQ ID NOS.: 1 - 9,980
contains an open-reading frame, set forth respectively in
SEQ ID NOS.: 9,981 - 19,771, that encodes a protein domain.
Thus, each of SEQ ID NOS. 1 - 9,980 can be used, or that
portion thereof in SEQ ID NOS. 9,981 - 19,771 used, to
express a protein domain by standard in vitro recombinant
techniques. See Ausubel et al. and Maniatis et al.

Additionally, kits are available commercially that readily
permit such nucleic acids to be expressed as protein in
bacterial cells, insect cells, or mammalian cells, as
desired (e.g., HAT™ Protein Expression & Purification
System, ClonTech Laboratories, Palo Alto, CA; Adeno-X™
Expression System, ClonTech Laboratories, Palo Alto, CA;
Protein Fusion & Purification (pMAL™) System, New England
Biolabs, Beverly, MA).

Furthermore, shorter peptides can be chemically synthesized
using commercial peptide synthesizing equipment and well
known techniques. Procedures are described, inter alia, in
Chan et al. (eds.), Fmoc Solid Phase Peptide Synthesis: A
Practical Approach (Practical Approach Series, (Paper)),
Oxford Univ. Press (March 2000) (ISBN: 0199637245); Jones,
Amino Acid and Peptide Synthesis (Oxford Chemistry Primers,
No 7), Oxford Univ. Press (August 1992) (ISBN: 0198556683);
and Bodanszky, Principles of Peptide Synthesis (Springer
Laboratory), Springer Verlag (December 1993) (ISBN:
0387564314).

It is, therefore, another aspect of the invention to provide
peptides comprising an amino acid sequence translated from
SEQ ID NOS.: 9,981 - 19,771. Such amino acid sequences are
set out in SEQ ID NOS.: 19,772 - 29,119. Any such
recombinantly-expressed or synthesized peptide of at least
8, and preferably at least about 15, amino acids, can be
conjugated to a carrier protein and used to generate
antibody that recognizes the peptide. Thus, it is a further
aspect of the invention to provide peptides that have at
least 8, preferably at least 15, consecutive amino acids.



The following examples are offered by way of illustration
and not by way of limitation.

EXAMPLE 1 

Preparation of Single Exon Microarrays from ORFs Predicted
in Human Genomic Sequence

Bioinformatics Results

All human BAC sequences in fewer than 10 pieces that had
been accessioned in a five month period immediately
preceding this study were downloaded from GenBank. This
corresponds to ~2200 clones, totaling ~350 MB of sequence,
or approximately 10% of the human genome.

After masking repetitive elements using the program
CROSS_MATCH, the sequence was analyzed for open reading
frames using three separate gene finding programs. The
three programs predict genes using independent algorithmic
methods developed on independent training sets: GRAIL uses
a neural network, GENEFINDER uses a hidden Markov model,
and DICTION, a program proprietary to Genetics Institute,
operates according to a different heuristic. The results of
all three programs were used to create a prediction matrix
across the segment of genomic DNA.

The three gene finding programs yielded a range of results.
GRAIL identified the greatest percentage of genomic sequence
as putative coding region, 2% of the data analyzed.
GENEFINDER was second, calling 1%, and DICTION yielded the
least putative coding region, with 0.8% of genomic sequence
called as coding region.

The consensus data were as follows. GRAIL and GENEFINDER
agreed on 0.7% of genomic sequence, GRAIL and DICTION agreed
on 0.5% of genomic sequence, and the three programs together
agreed on 0.25% of the data analyzed. That is, 0.25% of the
genomic sequence was identified by all three of the programs
as containing putative coding region.
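The agreement figures above amount to counting bases called as coding by more than one program. The following is a minimal sketch of such a consensus count, assuming each program's output has already been reduced to half-open (start, end) coordinate intervals; the data layout, function name and toy coordinates are hypothetical and do not reproduce the actual prediction matrix used in this Example.

```python
from collections import Counter


def consensus_bases(predictions, min_programs=2):
    """Return the set of base positions called as coding by at least
    min_programs of the gene finding programs.

    predictions maps a program name to a list of half-open (start, end)
    intervals of putative coding sequence.
    """
    votes = Counter()
    for intervals in predictions.values():
        called = set()
        for start, end in intervals:
            called.update(range(start, end))
        # Each program votes at most once per base position.
        votes.update(called)
    return {pos for pos, n in votes.items() if n >= min_programs}


if __name__ == "__main__":
    toy = {
        "GRAIL": [(0, 10)],
        "GENEFINDER": [(5, 15)],
        "DICTION": [(8, 20)],
    }
    print(len(consensus_bases(toy, min_programs=2)))
    print(sorted(consensus_bases(toy, min_programs=3)))
```

Expanding intervals into per-base sets is adequate only for small illustrations; a sweep over sorted interval endpoints would be the usual choice at genome scale.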

ORFs predicted by any two of the three programs ("consensus
ORFs") were assorted into "gene bins" using two criteria:
(1) any 7 consecutive exons within a 25 kb window were
placed together in a bin as likely contributing to a single
gene, and (2) all ORFs within a 25 kb window were placed
together in a bin as likely contributing to a single gene if
fewer than 7 exons were found within the 25 kb window.
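One possible reading of the two binning criteria is a greedy left-to-right scan: a bin is opened at an ORF and extended while subsequent ORFs start within 25 kb of the bin's first ORF and the bin holds fewer than 7 exons. The sketch below implements that reading only; the exact procedure is not specified beyond the two criteria above, and the function name and coordinates are hypothetical.

```python
def bin_orfs(orf_starts, window=25_000, max_exons=7):
    """Group ORF start coordinates into 'gene bins'.

    An ORF joins the current bin if it starts within `window` bp of the
    first ORF in that bin and the bin has fewer than `max_exons` members;
    otherwise it opens a new bin. This greedy rule is an assumption.
    """
    bins = []
    for start in sorted(orf_starts):
        current = bins[-1] if bins else None
        if current and start - current[0] <= window and len(current) < max_exons:
            current.append(start)
        else:
            bins.append([start])
    return bins


if __name__ == "__main__":
    print(bin_orfs([0, 1_000, 30_000]))
    print([len(b) for b in bin_orfs(range(0, 8_000, 1_000))])
```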



PCR

The largest ORF from each gene bin that did not span
repetitive sequence was then chosen for amplification, as
were all consensus ORFs longer than 500 bp. This method
approximated one exon per gene; however, a number of genes
were found to be represented by multiple elements.

Previously, we had determined that DNA fragments shorter
than 250 bp in length do not bind well to the
amino-modified glass surface of the slides used as support
substrate for construction of microarrays; therefore,
amplicons were designed in the present experiments to
approximate 500 bp in length.

Accordingly, after selecting the largest ORF per gene bin, a
500 bp fragment of sequence centered on the ORF was passed
to the primer picking software, PRIMER3 (available online
for use at http://www-genome.wi.mit.edu/cgi-bin/primer/ ).
A first additional sequence was commonly added to each
ORF-unique 5' primer, and a second, different, additional
sequence was commonly added to each ORF-unique 3' primer, to
permit subsequent reamplification of the amplicon using a
single set of "universal" 5' and 3' primers, thus
immortalizing the amplicon. The addition of universal
priming sequences also facilitates sequence verification,
and can be used to add a cloning site should some ORFs be
found to warrant further study.
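The two design steps described above, choosing a 500 bp window centered on the ORF and adding common tails to the ORF-specific primers, can be sketched as follows. The tail sequences shown are placeholders chosen for illustration and are not the universal sequences actually used; the function names and coordinates are likewise hypothetical, and primer selection itself (performed by PRIMER3) is not reproduced here.

```python
# Placeholder universal tails (assumption: NOT the sequences used in Example 1).
UNIVERSAL_5P_TAIL = "GTTTTCCCAGTCACGAC"
UNIVERSAL_3P_TAIL = "CAGGAAACAGCTATGAC"


def amplicon_window(orf_start: int, orf_end: int, target_len: int = 500):
    """Return (start, end) of a target_len window centered on the ORF.

    Coordinates are 0-based and half-open; the window is clamped at 0.
    """
    center = (orf_start + orf_end) // 2
    start = max(0, center - target_len // 2)
    return start, start + target_len


def add_universal_tails(forward_primer: str, reverse_primer: str):
    """Prepend the common 5' and 3' tails to an ORF-specific primer pair."""
    return (UNIVERSAL_5P_TAIL + forward_primer,
            UNIVERSAL_3P_TAIL + reverse_primer)


if __name__ == "__main__":
    print(amplicon_window(1_000, 1_150))          # a 150 bp ORF
    print(add_universal_tails("ACGTACGT", "TTGGCCAA"))
```

Because every amplicon carries the same two tails, a single primer pair matching the tails suffices to reamplify or sequence any product.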

The ORFs were then PCR amplified from genomic DNA, verified
on agarose gels, and sequenced using the universal primers
to validate the identity of the amplicon to be spotted in
the microarray.

Primers were supplied by Operon Technologies (Alameda, CA).
PCR amplification was performed by standard techniques using
human genomic DNA (Clontech, Palo Alto, CA) as template.
Each PCR product was verified by SYBR® green (Molecular
Probes, Inc., Eugene, OR) staining of agarose gels, with
subsequent imaging by Fluorimager (Molecular Dynamics, Inc.,
Sunnyvale, CA). PCR amplification was classified as
successful if a single band appeared.

The success rate for amplifying ORFs of interest directly
from genomic DNA using PCR was approximately 75%. FIG. 5
graphs the distribution of predicted ORF (exon) length and
distribution of amplified PCR products, with ORF length
shown in red and PCR product length shown in blue (which may
appear black in the figure). Although the range of ORF
sizes is readily seen to extend to beyond 900 bp, the mean
predicted exon size was only 229 bp, with a median size of
150 bp (n=9498). With an average amplicon size of 475 ± 25
bp, approximately 50% of the average PCR amplification
product contained predicted coding region, with the
remaining 50% of the amplicon containing either intron,
intergenic sequence, or both.
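The "approximately 50%" figure follows directly from the mean exon and amplicon sizes quoted above. As a quick check, with an illustrative helper name:

```python
def coding_fraction(exon_bp: float, amplicon_bp: float) -> float:
    """Fraction of an amplicon occupied by predicted coding sequence."""
    return exon_bp / amplicon_bp


if __name__ == "__main__":
    # Mean exon size (229 bp) over average amplicon size (475 bp).
    print(round(coding_fraction(229, 475), 2))
    # Same calculation using the median exon size (150 bp).
    print(round(coding_fraction(150, 475), 2))
```

The mean-based value is close to one half; using the median exon size instead gives a lower fraction, so the typical amplicon is, if anything, somewhat more than half noncoding.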

Using a strategy predicated on amplifying about 500 bp, it
was found that long exons had a higher PCR failure rate. To
address this, the bioinformatics process was adjusted to
amplify 1000, 1500 or 2000 bp fragments from exons larger
than 500 bp. This improved the rate of successful
amplification of exons exceeding 500 bp, which constitute
about 9.2% of the exons predicted by the gene finding
algorithms.

Approximately 75% of the probes disposed on the array (90%
of those that successfully PCR amplified) were
sequence-verified by sequencing in both the forward and
reverse direction using a MegaBACE sequencer (Molecular
Dynamics, Inc., Sunnyvale, CA), universal primers, and
standard protocols.

Some genomic clones (BACs) yielded very poor PCR and
sequencing results. The reasons for this are unclear, but
may be related to the quality of early draft sequence or the
inclusion of vector and host contamination in some submitted
sequence data.

Although the intronic and intergenic material flanking
coding regions could theoretically interfere with
hybridization during microarray experiments, subsequent
empirical results demonstrated that differential expression
ratios were not significantly affected by the presence of
noncoding sequence. The variation in exon size was
similarly found not to affect differential expression ratios
significantly; however, variation in exon size was observed
to affect the absolute signal intensity (data not shown).

The 350 MB of genomic DNA was, by the above-described
process, reduced to 9750 discrete probes, which were spotted
in duplicate onto glass slides using commercially available
instrumentation (MicroArray GenII Spotter and/or MicroArray
GenIII Spotter, Molecular Dynamics, Inc., Sunnyvale, CA).
Each slide additionally included either 16 or 32 E. coli
genes, the average hybridization signal of which was used as
a measure of background biological noise.

Each of the probe sequences was BLASTed against 
the human EST data set, the NR data set, and SwissProt 
GenBank (May 7, 1999 release 2.0.9). 

One third of the probe sequences (as amplified) produced an
exact match (BLAST Expect ("E") values less than 1e-100) to
either an EST (20% of sequences) or a known mRNA (13% of
sequences). A further 22% of the probe sequences showed
some homology to a known EST or mRNA (BLAST E values from
1e-5 to 1e-99). The remaining 45% of the probe sequences
showed no significant sequence homology to any expressed, or
potentially expressed, sequences present in public
databases.
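The three-way classification by BLAST E value can be expressed as a small function. The sketch below uses the cut-offs stated in this description (below 1e-100 for a match, 1e-100 to 1e-5 for homology, above 1e-5 or no hit for novel); the function name, the labels, and the treatment of a missing hit as "novel" are illustrative assumptions.

```python
from typing import Optional

MATCH_CUTOFF = 1e-100    # E values below this are scored as an exact match
HOMOLOGY_CUTOFF = 1e-5   # E values up to this are scored as homologous


def classify_blast_hit(best_e_value: Optional[float]) -> str:
    """Classify a probe by the E value of its best database hit.

    best_e_value is None when no hit was returned (assumed to mean novel).
    """
    if best_e_value is None or best_e_value > HOMOLOGY_CUTOFF:
        return "novel"
    if best_e_value < MATCH_CUTOFF:
        return "known"
    return "homologous"


if __name__ == "__main__":
    for e in (0.0, 1e-50, 1e-3, None):
        print(e, classify_blast_hit(e))
```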

All of the probe sequences (as amplified) were then
analyzed for protein similarities with the SwissProt
database using BLASTX, Gish et al., Nature Genet. 3:266
(1993). The predicted functional breakdowns of the 2/3 of
probes identical or homologous to known sequences are
presented in Table 1.



Table 1

Function of Predicted ORFs As Deduced From Comparative
Sequence Analysis

Total   V6 chip   V7 chip   Function Predicted from
                            Comparative Sequence Analysis
-----   -------   -------   -----------------------------
211     96        115       Receptor
120     43        77        Zinc Finger
30      11        19        Homeobox
25      9         16        Transcription Factor
17      11        7         Transcription
118     57        61        Structural
95      39        56        Kinase
36      18        18        Phosphatase
83      31        52        Ribosomal
45      19        26        Transport
21      17        14        Growth Factor
17      12        5         Cytochrome
50      33        17        Channel

As can be seen, the two most common types of 
genes were transcription factors and receptors, making up 
2.2% and 1.8% of the arrayed elements, respectively. 

EXAMPLE 2 

Gene Expression Measurements From Genome-Derived Single
Exon Microarrays

The two genome-derived single exon microarrays prepared
according to Example 1 were hybridized in a series of
simultaneous two-color fluorescence experiments to (1)
Cy3-labeled cDNA synthesized from message drawn individually
from each of brain, heart, liver, fetal liver, placenta,
lung, bone marrow, HeLa, BT 474, or HBL 100 cells, and (2)
Cy5-labeled cDNA prepared from message pooled from all ten
tissues and cell types, as a control in each of the
measurements. Hybridization and scanning were carried out
using standard protocols and Molecular Dynamics equipment.

Briefly, mRNA samples were bought from commercial sources
(Clontech, Palo Alto, CA and Amersham Pharmacia Biotech
(APB)). Cy3-dCTP and Cy5-dCTP (both from APB) were
incorporated during separate reverse transcriptions of
~1 ug of polyA+ mRNA performed using 1 ug oligo(dT)12-18
primer and 2 ug random 9mer primers as follows. After
heating to 70°C, the RNA:primer mixture was snap cooled on
ice. The following were then added to the RNA to the stated
final concentration: 1X Superscript II buffer, 0.01 M DTT,
100 uM dATP, 100 uM dGTP, 100 uM dTTP, 50 uM dCTP, 50 uM
Cy3-dCTP or Cy5-dCTP, and 200 U Superscript II enzyme. The
reaction was incubated for 2 hours at 42°C. After 2 hours,
the first strand cDNA was isolated by adding 1 U
Ribonuclease H and incubating for 30 minutes at 37°C. The
reaction was then purified using a Qiagen PCR cleanup
column, increasing the number of ethanol washes to 5. Probe
was eluted using 10 mM Tris pH 8.5.

Using a spectrophotometer, probes were measured for dye
incorporation. Volumes of both Cy3 and Cy5 cDNA
corresponding to 50 pmoles of each dye were then dried in a
Speedvac, resuspended in 30 ul hybridization solution
containing 50% formamide, 5X SSC, 0.2 ug/ul poly(dA), 0.2
ug/ul human Cot-1 DNA, and 0.5% SDS.

Hybridizations were carried out under a coverslip, with the
array placed in a humid oven at 42°C overnight. Before
scanning, slides were washed in 1X SSC, 0.2% SDS at 55°C for
5 minutes, followed by 0.1X SSC, 0.2% SDS, at 55°C for 20
minutes. Slides were briefly dipped in water and dried
thoroughly under a gentle stream of nitrogen.

Slides were scanned using a Molecular Dynamics Gen3 scanner,
as described in Schena (ed.), Microarray Biochip: Tools and
Technology, Eaton Publishing Company/BioTechniques Books
Division (2000) (ISBN: 1881299376).

Although the use of pooled cDNA as a reference permitted the
survey of a large number of tissues, it attenuates the
measurement of relative gene expression, since every highly
expressed gene in the tissue/cell type-specific fluorescence
channel will be present to a level of at least 10% in the
control channel. Because of this fact, both signal and
expression ratios (the latter hereinafter, "expression" or
"relative expression") for each probe were normalized using
the average ratio or average signal, respectively, as
measured across the whole slide.

Data were accepted for further analysis only when signal was
at least three times greater than biological noise, the
latter defined by the average signal produced by the E. coli
control genes.
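The slide-wide normalization and the three-fold noise filter described above can be sketched as follows. This is a minimal illustration under stated assumptions: signals are held in plain Python lists, normalization is a simple division by the slide average, and the function names and example values are hypothetical rather than taken from the analysis software actually used.

```python
from statistics import mean


def normalize_by_slide_average(values):
    """Divide each per-probe value by the average across the whole slide."""
    slide_average = mean(values)
    return [v / slide_average for v in values]


def passes_noise_filter(signal, control_signals, fold=3.0):
    """True when signal is at least `fold` times the biological noise,
    taken as the average signal of the E. coli control spots."""
    return signal >= fold * mean(control_signals)


if __name__ == "__main__":
    print(normalize_by_slide_average([1.0, 2.0, 3.0]))
    controls = [0.2, 0.2]  # illustrative control signals
    print(passes_noise_filter(0.7, controls), passes_noise_filter(0.5, controls))
```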

The relative expression signal for these probes was then
plotted as a function of tissue or cell type, and is
presented in FIG. 6.

30 FIG. 6 shows the distribution of expression 

across a panel of ten tissues. The graph shows. the number 
of sequence-verified products that were either not 
expressed ("0"), expressed in one or more but not all 
tested tissues ("1" - "9"), and expressed in all tissues 

35 tested ("10"). 
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Of 9999 arrayed elements on the two microarrays 
(including positive and negative controls and "failed" 
products), 2353 (51%) were expressed in at least one tissue 
or cell type. Of the gene elements showing significant 
5 signal — where expression was scored as "significant" if 
the normalized Cy3 signal was greater than 1, representing 
signal 5-fold over biological noise (0.2) - 39% (991) were 
expressed in all 10 tissues. The next most common class 
(15%) consisted of gene elements expressed in only a single 
10 tissue. 
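
The tally underlying such a distribution can be sketched as follows. The sketch assumes each probe carries one normalized Cy3 signal per tissue and uses the cutoff of 1 stated above; the function names, probe names and signal values are hypothetical, and only three tissues are shown for brevity.

```python
from collections import Counter

SIGNIFICANT_SIGNAL = 1.0  # normalized Cy3 cutoff stated in the text


def tissues_expressed(signals_by_tissue, cutoff=SIGNIFICANT_SIGNAL):
    """Count the tissues in which a probe's normalized signal exceeds the cutoff."""
    return sum(1 for signal in signals_by_tissue.values() if signal > cutoff)


def expression_distribution(probes, cutoff=SIGNIFICANT_SIGNAL):
    """Map 'number of tissues expressed in' to 'number of probes'."""
    return Counter(tissues_expressed(s, cutoff) for s in probes.values())


# Hypothetical probes measured in three tissues.
example = {
    "probe_a": {"heart": 2.5, "brain": 0.3, "liver": 0.1},
    "probe_b": {"heart": 1.8, "brain": 4.0, "liver": 1.2},
    "probe_c": {"heart": 0.2, "brain": 0.4, "liver": 0.9},
}
distribution = expression_distribution(example)
```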

The genes expressed in a single tissue were
further analyzed, and the results of the analyses are
compiled in FIG. 7.

FIG. 7A is a matrix presenting the expression of
all verified sequences that showed expression greater than
3 in at least one tissue. Each clone is represented by a
column in the matrix. Each of the 10 tissues assayed is
represented by a separate row in the matrix, and relative
expression of a clone in that tissue is indicated at the
respective node by intensity of green shading, with the
intensity legend shown in panel B. The top row of the
matrix ("EST Hit") contains "bioinformatic" rather than
"physical" expression data; that is, it presents the results
returned by query of EST, NR and SwissProt databases using
the probe sequence. The legend for "bioinformatic
expression" (i.e., degree of homology returned) is
presented in panel C. Briefly, white is known, black is
novel, with gray depicting nonidentical with significant
homology (white: E values < 1e-100; gray: E values from
1e-05 to 1e-99; black: E values > 1e-05).
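
The three-way legend can be expressed as a small classification function. This is a sketch only: the function name `classify_e_value` is hypothetical, and because the legend leaves E values between 1e-100 and 1e-99 unassigned, the sketch assumes they fall in the gray class.

```python
def classify_e_value(e_value):
    """Return the legend shade for a top-hit BLAST E value.

    white = known, gray = nonidentical with significant homology,
    black = novel. Values from 1e-100 up to 1e-05 are treated as gray,
    which is an assumption for the interval the legend does not cover.
    """
    if e_value < 1e-100:
        return "white"
    if e_value <= 1e-5:
        return "gray"
    return "black"


shades = [classify_e_value(e) for e in (1e-120, 1e-50, 2.0)]
```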

As FIG. 7 readily shows, heart and brain were
demonstrated to have the greatest numbers of genes that
were shown to be uniquely expressed in the respective
tissue. In brain, 200 uniquely expressed genes were
identified; in heart, 150. The remaining tissues gave the
following figures for uniquely expressed genes: liver, 100;
lung, 70; fetal liver, 150; bone marrow, 75; placenta, 100;
HeLa, 50; HBL, 100; and BT474, 50.

It was further observed that there were many more
"novel" genes among those that were up-regulated in only
one tissue, as compared with those that were down-regulated
in only one tissue. In fact, it was found that ORFs whose
expression was measurable in only a single one of the tested
tissues were represented in sequencing databases at a rate
of only 11%, whereas 36% of the ORFs whose expression was
measurable in 9 of the tissues were present in public
databases. As for those ORFs expressed in all ten tissues,
fully 45% were present in existing expressed sequence
databases. These results are not unexpected, since genes
expressed in a greater number of tissues have a higher
likelihood of being, and thus of having been, discovered by
EST approaches.

Comparison of Signal from Known and Unknown Genes

The normalized signal of the genes found to have
high homology to genes present in the GenBank human EST
database was compared to the normalized signal of those
genes not found in the GenBank human EST database. The
data are shown in FIG. 8.

FIG. 8 shows the normalized Cy3 signal intensity
for all sequence-verified products with a BLAST Expect
("E") value of greater than 1e-30 (designated "unknown")
upon query of existing EST, NR and SwissProt databases, and
shows in blue the normalized Cy3 signal intensity for all
sequence-verified products with a BLAST Expect value of
less than 1e-30 ("known"). Note that biological background
noise has an averaged normalized Cy3 signal intensity of
0.2.

As expected, the most highly expressed of the
ORFs were "known" genes. This is not surprising, since
very high signal intensity correlates with very commonly
expressed genes, which have a higher likelihood of being
found by EST sequencing.

However, a significant point is that a large
number of even the high expressers were "unknown". Since
the genomic approach used to identify genes and to confirm
their expression does not bias exons toward either the 3'
or 5' end of a gene, many of these high expression genes
will not have been detected in an end-sequenced cDNA
library.

The significant point is that presence of the
gene in an EST database is not a prerequisite for
incorporation into a genome-derived microarray, and
further, that arraying such "unknown" exons can help to
assign function to as-yet undiscovered genes.

Verification of Gene Expression

To ascertain the validity of the approach
described above to identify genes from raw genomic
sequence, expression of two of the probes was assayed using
reverse transcriptase polymerase chain reaction (RT-PCR)
and northern blot analysis.

Two microarray probes were selected on the basis
of exon size, prior sequencing success, and tissue-specific
gene expression patterns as measured by the microarray
experiments. The primers originally used to amplify the
two respective ORFs from genomic DNA were used in RT-PCR
against a panel of tissue-specific cDNAs (Rapid-Scan gene
expression panel 24 human cDNAs) (OriGene Technologies,
Inc., Rockville, MD).

Sequence AL079300_1 was shown by microarray
hybridization to be present in cardiac tissue, and sequence
AL031734_1 was shown by microarray experiment to be present
in placental tissue (data not shown). RT-PCR on these two
sequences confirmed the tissue-specific gene expression as
measured by microarrays, as ascertained by the presence of
a correctly sized PCR product from the respective tissue
type cDNAs.

Clearly, not every microarray result can, or
indeed should, be confirmed by independent assay
methods; otherwise the high throughput, highly parallel advantages
of microarray hybridization assays would be lost. However,
in addition to the two RT-PCR results presented above, the
observation that 1/3 of the arrayed genes exist in
expression databases provides strong confirmation of the
power of our methodology, which combines bioinformatic
prediction with expression confirmation using
genome-derived single exon microarrays, to identify novel genes
from raw genomic data.

To verify that the approach further provides
correct characterization of the expression patterns of the
identified genes, a detailed analysis was performed of the
microarrayed sequences that showed high signal in brain.

For this latter analysis, sequences that showed
high (normalized) signal in brain, but which showed very
low (normalized) signal (less than 0.5, determined to be
biological noise) in all other tissues, were further
studied. There were 82 sequences that fit these criteria,
approximately 2% of the arrayed elements. The 10 sequences
showing the highest signal in brain in microarray
hybridizations are detailed in Table 2, along with assigned
function, if known or reasonably predicted.
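
The selection of brain-only sequences can be sketched as a simple filter. The 0.5 noise cutoff is taken from the text; the minimum "high" signal in the target tissue is not stated in the text, so the default of 1.0 used here is an assumption, as are the function name `tissue_specific` and all example values.

```python
def tissue_specific(probes, tissue, noise_cutoff=0.5, min_signal=1.0):
    """Return probes high in `tissue` and below noise in every other tissue.

    `min_signal` is an assumed threshold for "high" signal; results are
    ordered from highest to lowest signal in the target tissue.
    """
    hits = []
    for name, signals in probes.items():
        others = [s for t, s in signals.items() if t != tissue]
        if signals[tissue] >= min_signal and all(s < noise_cutoff for s in others):
            hits.append(name)
    return sorted(hits, key=lambda name: probes[name][tissue], reverse=True)


# Hypothetical normalized signals in three tissues.
example = {
    "probe_a": {"brain": 2.3, "heart": 0.4, "liver": 0.3},
    "probe_b": {"brain": 5.2, "heart": 0.2, "liver": 0.1},
    "probe_c": {"brain": 3.0, "heart": 1.5, "liver": 0.2},
    "probe_d": {"brain": 0.3, "heart": 0.1, "liver": 0.2},
}
brain_only = tissue_specific(example, "brain")
```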

Table 2

Function of the Most Highly
Expressed Genes Expressed Only in Brain

Microarray     Normalized  Expression  Homology to EST     Gene Function as described by GenBank
Sequence Name  Signal      Ratio       present in GenBank

AP000217-1     5.2         +7.7        High                S-100 protein, b-chain, Ca2+ binding protein expressed in central nervous system
AP000047-1     2.3                     High                Unknown Function
AC006548-9     1.7                     High                Similar to mouse membrane glyco-protein M6, expressed in central nervous system
AC007245-5     1.5                     High                Similar to amphiphysin, a synaptic vesicle-associated protein. Ref 21
L44140-4       1.2         +2.0        High                Endothelial actin-binding protein found in nonmuscle filamin
AC004689-9     1.2         +3.5        High                Protein Phosphatase PP2A, neuronal/ downregulates activated protein kinases
AL031657-1     1.2         +3.0        High                Unknown function/ Contains the ankyrin motif, a common protein sequence motif
AC009266-2     1.1         +3.7        Low                 Low homology to the Synaptotagmin I protein in rat/present at low levels throughout rat brain
AP000086-1     1.0         +2.7        Low                 Unknown, very poor homology to collagen
AC004689-3     1.0                     High                Protein Phosphatase PP2A, neuronal/ downregulates activated protein kinases


Of the ten sequences studied by these latter
confirmatory approaches, eight were previously known. Of
these eight, six had previously been reported to be
important in the central nervous system or brain. The exon
giving the highest signal (AP000217-1) was found to be the
gene encoding an S100B Ca2+ binding protein, reported in
the literature to be highly and uniquely expressed in the
central nervous system. Heizmann, Neurochem. Res. 9:1097
(1997).

A number of the brain-specific probe sequences
(including AC006548-9, AC009266-2) did not have homology to
any known human cDNAs in GenBank but did show homology to
rat and mouse cDNAs. Sequences AC004689-9 and AC004689-3
were both found to be phosphatases present in neurons
(Millward et al., Trends Biochem. Sci. 24(5):186-191
(1999)). Two microarray sequences, AP000047-1 and
AP000086-1, have unknown function, with AP000086-1 being
absent from GenBank. Functionality can now be narrowed
down to a role in the central nervous system for both of
these genes, showing the power of designing microarrays in
this fashion.

Next, the function of the chip sequences with the
highest (normalized) signal intensity in brain, regardless
of expression in other tissues, was assessed. In this
latter analysis, we found expression of many more common
genes, since the sequences were not limited to those
expressed only in brain. For example, looking at the 20
highest signal intensity spots in brain, 4 were similar to
tubulin (AC00807905; AF146191-2; AC007664-4; AF14191-2), 2
were similar to actin (AL035701-2; AL034402-1), and 6 were
found to be homologous to glyceraldehyde-3-phosphate
dehydrogenase (GAPDH) (AL035604-1; Z86090-1; AC006064-L,
AC006064-K; AC035604-3; AC006064-L). These genes are often
used as controls or housekeeping genes in microarray
experiments of all types.

WO 01/57274 PCT/US01/00666

Other interesting genes highly expressed in brain
included a ferritin heavy chain protein, which is reported in
the literature to be found in brain and liver (Joshi et
al., J. Neurol. Sci. 134 (Suppl):52-56 (1995)), a result
duplicated with the array. Other highly expressed chip
sequences included a translation elongation factor ID
(AC007564-4), a DEAD-box homolog (AL023804-4), and a
Y-chromosome RNA-binding motif (Chai et al., Genomics
49(2):283-89 (1998)) (AC007320-3). A low homology analog
(AP00123-1/2) to a gene, DSCR1, thought to be involved in
trisomy 21 (Down's syndrome), showed high expression in
both brain and heart, in agreement with the literature
(Fuentes et al., Mol. Genet. 4(10):1935-44 (1995)).

As a further validation of the approach, we
selected the BAC AC006064 to be included on the array.
This BAC was known to contain the GAPDH gene, and thus
could be used as a control for the ORF selection process.
The gene finding and exon selection algorithms resulted in
choosing 25 exons from BAC AC006064 for spotting onto the
array, of which four were drawn from the GAPDH gene. Table
3 shows the comparison of the average expression ratio for
the 4 exons from BAC AC006064 compared with the average
expression ratio for 5 different dilutions of a
commercially available GAPDH cDNA (Clontech).



Table 3

Comparison of Expression Ratio, for each
tissue, of GAPDH

Tissue         AC006064 (n = 4)   Control (n = 5)

Bone Marrow    -1.81 ± 0.11       -1.85 ± 0.08
Brain          -1.41 ± 0.11       -1.17 ± 0.05
BT474           1.85 ± 0.09        1.66 ± 0.12
Fetal Liver    -1.62 ± 0.07       -1.41 ± 0.05
HBL100          1.32 ± 0.05        2.64 ± 0.12
Heart           1.16 ± 0.09        1.56 ± 0.10
HeLa            1.11 ± 0.06        1.30 ± 0.15
Liver          -1.62 ± 0.22       -2.07 ±
Lung           -4.95 ± 0.93       -3.75 ± 0.21
Placenta       -3.56 ± 0.25       -3.52 ± 0.43



Each tissue shows excellent agreement between the
experimentally chosen exons and the control, again
demonstrating the validity of the present exon mining
approach. In addition, the data also show the variability
of expression of GAPDH within tissues, calling into
question its classification as a housekeeping gene and
utility as a housekeeping control in microarray
experiments.

EXAMPLE 3

Representation of Sequence and Expression Data as a
"Mondrian"

For each genomic clone processed for microarray
as above-described, a plethora of information was
accumulated, including full clone sequence, probe sequence
within the clone, results of each of the three gene finding
programs, EST information associated with the probe
sequences, and microarray signal and expression for
multiple tissues, challenging our ability to display the
information.

Accordingly, we devised a new tool for visual
display of the sequence with its attendant annotation
which, in deference to its visual similarity to the
paintings of Piet Mondrian, is hereinafter termed a
"Mondrian". FIGS. 3 and 4 present the key to the
information presented on a Mondrian.

FIG. 9 presents a Mondrian of BAC AC008172 (bases
25,000 to 130,000 shown), containing the carbamyl phosphate
synthetase gene (AF154830.1). Purple background within the
region shown as field 81 in FIG. 3 indicates all 37 known
exons for this gene.

As can be seen, GRAIL II successfully identified
27 of the known exons (73%), GENEFINDER successfully
identified 37 of the known exons (100%), while DICTION
identified 7 of the known exons (19%).
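
The percentages quoted above follow directly from the exon counts, as the short calculation below illustrates. The counts are those given in the text; the function name `percent_identified` and the dictionary layout are hypothetical conveniences.

```python
KNOWN_EXONS = 37  # known exons of the carbamyl phosphate synthetase gene

# Number of known exons identified by each gene finding program.
identified = {"GRAIL II": 27, "GENEFINDER": 37, "DICTION": 7}


def percent_identified(found, total=KNOWN_EXONS):
    """Percentage of known exons found, rounded to the nearest whole percent."""
    return round(100 * found / total)


percentages = {program: percent_identified(n) for program, n in identified.items()}
```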

Seven of the predicted exons were selected for
physical assay, of which 5 successfully amplified by PCR
and were sequenced. These five exons were all found to be
from the same gene, the carbamyl phosphate synthetase gene
(AF154830.1).

The five exons were arrayed, and gene expression
measured across 10 tissues. As is readily seen in the
Mondrian, the five chip sequences on the array show
identical expression patterns, elegantly demonstrating the
reproducibility of the system.

FIG. 10 is a Mondrian of BAC AL049839. We
selected 12 exons from this BAC, of which 10 successfully
sequenced, which were found to form between 5 and 6 genes.
Interestingly, 4 of the genes on this BAC are protease
inhibitors. Again, these data elegantly show that exons
selected from the same gene show the same expression
patterns, depicted below the red line. From this figure,
it is clear that our ability to find known genes is very
good. A novel gene is also found from 86.6 kb to 88.6 kb,
upon which all the exon finding programs agree. We are
confident we have two exons from a single gene since they
show the same expression patterns and the exons are
proximal to each other. Backgrounds in the following
colors indicate a known gene (top to bottom):
red = kallistatin protease inhibitor (P29622);
purple = plasma serine protease inhibitor (P05154);
turquoise = α1 anti-chymotrypsin (P01011); mauve = 40S
ribosomal protein (P08865). Note that chip sequences 8 and
12 did not sequence verify.

EXAMPLE 4

Genome-Derived Single Exon Probes Useful For Measuring
Human Gene Expression

The protocols set forth in Examples 1 and 2,
supra, were applied to additional human genomic sequence as
it became newly available in GenBank to identify unique
exons in the human genome that could be shown to be
expressed at significant levels in heart tissue.

These unique exons are within longer probe
sequences. Each probe was completely sequenced on both
strands prior to its use on a genome-derived single exon
microarray; sequencing confirms the exact chemical
structure of each probe. An added benefit of sequencing is
that it placed us in possession of a set of single
base-incremented fragments of the sequenced nucleic acid,
starting from the sequencing primer 3' OH. (Since the
single exon probes were first obtained by PCR amplification
from genomic DNA, we were of course additionally in
possession of an even larger set of single base-incremented
fragments of each of the 9,980 single exon probes, each
fragment corresponding to an extension product from one of
the two amplification primers.)

The structures of the 9,980 unique single exon
probes are clearly presented in the Sequence Listing as SEQ
ID Nos.: 1 - 9,980. The 16 nt 5' primer sequence and 16 nt
3' primer sequence present on the amplicon are not included
in the sequence listing. The sequences of the exons
present within each of these probes are presented in the
Sequence Listing as SEQ ID Nos.: 9,981 - 19,771,
respectively. It will be noted that some amplicons have
more than one exon, and some exons are contained in more than
one amplicon.

As detailed in Example 2, expression was
demonstrated by disposing the amplicons as single exon
probes on nucleic acid microarrays and then performing
two-color fluorescent hybridization analysis; significant
expression is based on a statistical confidence that the
signal is significantly greater than negative biological
control spots. The negative biological control is formed
from spotted DNA sequences from a different species. Here,
32 sequences from E. coli were spotted in duplicate to give
a total of 64 spots.

For each hybridisation (each slide, each colour) 
the median value of the signal from all of the spots is 
determined. The normalised signal value is the arithmetic 
mean of the signal from duplicate spots divided by the 
population median. 

Control spots are eliminated if there is more
than a five-fold difference between the raw signals of the
two duplicate spots.

The median of the signal from the remaining 
control spots is calculated and all subsequent calculations 
are done with normalised signals. 

Control spots having a signal of greater than
median + 2.4 (the value 2.4 is roughly 12 times the
observed standard deviation of control spot populations)
are eliminated. Spots with such high signals are considered
to be "outliers".

The mean and standard deviation of the modified 
control spot populations are calculated. 

The mean + 3x the standard deviation (mean + 
(3*SD) ) is used as the signal threshold qualifier for that 
particular hybridisation. Thus, individual thresholds are 
determined for each channel and each hybridisation. 

This means that, assuming that the data is 
distributed normally, there is a 99% confidence that any 
signal exceeding the threshold is significant. 
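
The threshold determination for a single hybridisation and channel can be sketched as below. This is an illustrative sketch under stated assumptions, not the software actually used: the function name `signal_threshold`, the input layout (a list of all raw spot signals plus a list of raw duplicate pairs for the control spots) and the example numbers are hypothetical, and the sample standard deviation is assumed because the text does not say which estimator was used.

```python
from statistics import mean, median, stdev


def signal_threshold(all_spot_signals, control_pairs,
                     max_fold=5.0, outlier_offset=2.4, n_sd=3.0):
    """Return mean + n_sd * SD of the cleaned, normalised control spots."""
    population_median = median(all_spot_signals)
    if population_median == 0:
        raise ValueError("population median is zero; cannot normalise")

    # Drop control pairs whose raw duplicate signals differ more than five-fold.
    consistent = [(a, b) for a, b in control_pairs
                  if max(a, b) <= max_fold * min(a, b)]

    # Normalised signal: mean of the duplicates divided by the population median.
    normalised = [mean(pair) / population_median for pair in consistent]

    # Drop outliers lying above (control median + 2.4).
    control_median = median(normalised)
    kept = [s for s in normalised if s <= control_median + outlier_offset]

    # Sample standard deviation is an assumption of this sketch.
    return mean(kept) + n_sd * stdev(kept)


# Hypothetical raw data for one slide and one colour.
all_spots = [100, 200, 300]
controls = [(20, 20), (40, 40), (60, 60), (10, 100), (600, 600)]
threshold = signal_threshold(all_spots, controls)
```

In this example the pair (10, 100) is removed by the five-fold rule and the pair (600, 600) is removed as an outlier, leaving normalised control signals of 0.1, 0.2 and 0.3.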

The probes and their expression data are
presented in Table 4, set forth in Example 5.
Example 5 presents the subset of probes that is
significantly expressed in the human heart and thus
presents the subset of probes that was recognized to be
useful for measuring expression of their cognate genes in
human heart tissue.

The sequence of each of the exon probes
identified by SEQ ID NOS.: 9,981 - 19,771 was individually
used as a BLAST (or, for SWISSPROT, BLASTX) query to
identify the most similar sequence in each of dbEST,
SwissProt (BLASTX), and NR divisions of GenBank. Because
the query sequences are themselves derived from genomic
sequence in GenBank, only nongenomic hits from NR were
scored.

The smallest in value of the BLAST (or BLASTX)
expect ("E") scores for each query sequence across the
three database divisions was used as a measure of the
"expression novelty" of the probe's ORF. Table 4 is sorted
in descending order based on this measure, reported as
"Most Similar (top) Hit BLAST E Value". Those sequences for
which no "Hit E Value" is listed are those exons which were
found to have no similar sequences.
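
The choice of the top hit and the resulting sort order can be sketched as follows. The sketch assumes that, for each probe, the best hit from each database is stored as an (accession, E value) pair and that databases returning no hit are simply omitted; placing probes with no hit at all first in the ordering is likewise an assumption. All probe names, accession strings and E values shown are hypothetical.

```python
import math


def top_hit(hits):
    """Return (database, accession, e_value) for the smallest E value, or None."""
    candidates = [(e, db, acc) for db, (acc, e) in hits.items()]
    if not candidates:
        return None
    e, db, acc = min(candidates)
    return db, acc, e


def sort_by_novelty(probes):
    """Order probe names from highest to lowest top-hit E value.

    Probes with no hit in any database are treated as most novel (assumed).
    """
    def novelty(name):
        hit = top_hit(probes[name])
        return math.inf if hit is None else hit[2]
    return sorted(probes, key=novelty, reverse=True)


# Hypothetical per-database best hits.
example = {
    "probe_a": {"dbEST": ("HYP_EST_1", 1e-120), "NR": ("HYP_NT_1", 1e-80)},
    "probe_b": {"SWISSPROT": ("HYP_SP_1", 4.2)},
    "probe_c": {},
    "probe_d": {"NR": ("HYP_NT_2", 1e-20), "dbEST": ("HYP_EST_2", 0.5)},
}
ordering = sort_by_novelty(example)
```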

As sorted, Table 4 thus lists its respective
probes (by "AMPLICON SEQ ID NO.:" and additionally by the
SEQ ID NO: of the exon contained within the probe: "EXON
SEQ ID NO.:") from least similar to sequences known to be
expressed (i.e., highest BLAST E value), at the beginning
of the table, to most similar to sequences known to be
expressed (i.e., lowest BLAST E value), at the bottom of
the table.

Table 4 further provides, for each listed probe,
the accession number of the database sequence that yielded
the "Most Similar (top) Hit BLAST E Value", along with the
name of the database in which the database sequence is
found ("Top Hit Database Source").

Table 4 further provides SEQ ID NOS.
corresponding to the predicted amino acid sequences where
they have been determined for the probe and exon nucleotide
sequences. These are set out as PEPTIDE SEQ ID NOS.:. The
peptide sequences for a given exon are predicted as
follows: Since each chip exon is a consensus sequence drawn
from predictions from various exon finding programs (i.e.
Grail, GeneFinder and GenScan), the multiple initial ORFs
are first determined in a uniform way according to each
prediction. In particular, the reading frame for predicting
the first amino acid in the peptide sequence always starts
with the first base of any codon and ends with the last
base of a non-termination codon. Next, for each strand of the
exon, initial ORFs are merged into one or more final ORFs
in an exhaustive process based on the following criteria:
1) the merging ORFs must be overlapping, and 2) the merging
ORFs must be in the same frame.
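
The merging step can be illustrated with a minimal sketch. It assumes that the initial ORFs on one strand are given as zero-based, half-open (start, end) coordinates whose lengths are multiples of three, so that the reading frame is the start coordinate modulo 3; the function name `merge_orfs` and the example coordinates are hypothetical.

```python
def merge_orfs(orfs):
    """Merge overlapping ORFs that share a reading frame on one strand.

    `orfs` is a list of (start, end) half-open intervals. The result lists
    the final ORFs frame by frame, each frame in order of start coordinate.
    """
    by_frame = {}
    for start, end in orfs:
        by_frame.setdefault(start % 3, []).append((start, end))

    merged = []
    for frame in sorted(by_frame):
        current = None
        for start, end in sorted(by_frame[frame]):
            if current is not None and start < current[1]:
                # Overlaps the ORF being built: extend it.
                current = (current[0], max(current[1], end))
            else:
                if current is not None:
                    merged.append(current)
                current = (start, end)
        merged.append(current)
    return merged


# Hypothetical initial ORFs from different exon finding programs.
initial_orfs = [(0, 30), (21, 60), (1, 40), (90, 120)]
final_orfs = merge_orfs(initial_orfs)
```

Sorting by start coordinate before sweeping makes the merge exhaustive within each frame, since any chain of overlapping ORFs collapses into a single final ORF.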

The Sequence Listing, which is a superset of all
of the data presented in Table 4, further includes, for
each probe, the most similar hit, with accession number and
BLAST E value, from each of the three queried
databases.

Table 4 further lists, for each probe, a portion
of the descriptor for the top hit ("Top Hit Descriptor") as
provided in the sequence database. For those ORFs that are
similar in sequence, but nonidentical to known sequences
(e.g., those with BLAST E values between about 1e-05 and
1e-100), the descriptor reveals the likely function of the
protein encoded by the probe's ORF.

Using BLAST E value cutoffs of 1e-05 (i.e., 1 x
10^-5) and 1e-100 (i.e., 1 x 10^-100) as evidence of similarity
to sequences known to be expressed is of course arbitrary:
in Example 2, supra, a BLAST E value of 1e-30 was used as
the boundary when only two classes were to be defined for
analysis (unknown, >1e-30; known, <1e-30) (see also FIG. 8).
Furthermore, even when the "Most Similar (Top) Hit BLAST E
Value" is low, e.g., less than about 1e-100 (which is
probative evidence that the query sequence has previously
been shown to be expressed), the top hit is highly unlikely
to match the probe sequence exactly.

First, such expression entries typically will not
have the intronic and/or intergenic sequence present within
the single exon probes listed in the Table. Second, even
the ORF itself is unlikely in such cases to be present
identically in the databases, since most of the EST and
mRNA clones in existing databases include multiple exons,
without any indication of the location of exon boundaries.

As noted, the data presented in Table 4 represent
a proper subset of the data present within the attached
sequence listing. For each amplicon probe (SEQ ID NOs.: 1
- 9,980) and probe exon (SEQ ID NOs.: 9,981 - 19,771,
respectively), the sequence listing further provides,
through iterated annotation fields <220> and <223>:

(a) the accession number of the BAC from which
the sequence was derived ("MAP TO"), thus providing a link
to the chromosomal map location and other information about
the genomic milieu of the probe sequence;

(b) the most similar sequence provided by BLAST
query of the EST database, with accession number and BLAST
E value for the "hit";

(c) the most similar sequence provided by BLAST
query of the GenBank NR database, with accession number and
BLAST E value for the "hit"; and

(d) the most similar sequence provided by BLASTX
query of the SWISSPROT database, with accession number and
BLAST E value for the "hit".



EXAMPLE 5

Genome-Derived Single Exon Probes Useful For Measuring
Expression of Genes in Human Heart

Table 4 (413 pages) presents expression, homology, and
functional information for the genome-derived single exon
probes that are expressed significantly in human heart.

Top Hit Descriptor

Homo sapiens LSS gene, partial, exons 15, 16, 17 and 18
Haemophilus influenzae Rd section 31 of 163 of the complete genome
Mus musculus Naip3 gene, exon 1; neuronal apoptosis inhibitory protein 1 (Naip1) and general transcription factor IIH polypeptide 2 (Gtf2h2) genes, complete cds
Mus musculus Naip3 gene, exon 1; neuronal apoptosis inhibitory protein 1 (Naip1) and general transcription factor IIH polypeptide 2 (Gtf2h2) genes, complete cds
Dengue virus type 3 membrane protein (prM/M)/envelope glycoprotein (E) polyprotein mRNA, partial cds
Dengue virus type 3 membrane protein (prM/M)/envelope glycoprotein (E) polyprotein mRNA, partial cds
Mus musculus AT3 gene for antithrombin, complete cds
IMMEDIATE-EARLY PROTEIN 1 (IE1) (IMMEDIATE-EARLY PHOSPHOPROTEIN PP89)
Leuciscus cephalus orientalis cytochrome b (cyt b) gene, partial cds; mitochondrial gene for mitochondrial product
Leuciscus cephalus orientalis cytochrome b (cyt b) gene, partial cds; mitochondrial gene for mitochondrial product


Top Hit 
Database 




































!nt ! 


INT I 


k 


NT 


NT 


z 


NT J 


O 
K 


|nt 


U 


|EST_HUMAN | 


Top Hit Acession 
No. 




































|AJ239028.1 j 


iU32716.1 j 


|aF242432.1 


|aF242432.1 




|l11433.1 i 


§ 






i 


i 


Most Similar 
(Top) Hit 
BLAST E 
Value 




































| 9.9E+00| 


j 9.8E+00| 


1 


9.6E+Oo| 




| 9.4E+Oo| 


1 S.4E+00I 


| 9.3E + 00| 




I 9.1E+00I 


1 8.9E+00| 


Expression 
Signal 












5 




5 






2.8A\ 








2.19| 






' 13.31 1 








l 2.72I 






1 2.97| 


Jl 


2.07 




ORF SEQ 

ID NO: 


| 25700 [ 




| 26404 | 


1 26405| 


s 

8 




j 27569 | 






1 28355| 




1 


281 50 1 


8 






25230 | 




26745 | 


2804e| 


28047 | 


1 


§ 


1 226121 


| 27271 | 


[ 24932J 






Exon 
SEQ ID 
NO: 










| 16577[ 


| 


8 


! 175341 


1 


! 19472J 


I 




17906| 


I 


1 


I 








1 


I78O5J 


1 


1 


1 128201 


1 170831 


I 151631 




| 15587| 


Probe 
SEQ ID 
NO: 


I 


1 


s 

8 


a 
s 


1 


I 


| 7494| 




| 8070 1 


1 


1 


I 


| 8757 [ 


I 8757I 


1 


i 




I 


1 




I 7955I 


I 2631 I 


1 


1 


1 7206| 






s 

Top Hit Descriptor

LYCOPENE BETA CYCLASE
PNEUMOLYSIN (THIOL-ACTIVATED CYTOLYSIN)
Arabidopsis thaliana DNA chromosome 4, contig fragment No. 67
VITELLOGENIN PRECURSOR (VTG) [CONTAINS: LIPOVITELLIN LV-1N; LIPOVITELLIN LV-1C; LIPOVITELLIN LV-2]
RHODOPSIN
RHODOPSIN
Bovine immunodeficiency-like virus surface envelope gene, 5' end of cds
DNA POLYMERASE GAMMA (MITOCHONDRIAL DNA POLYMERASE CATALYTIC SUBUNIT)
PROBABLE ANTIBACTERIAL PEPTIDE POLYPROTEIN PRECURSOR
COLICIN N IMMUNITY PROTEIN (MICROCIN N IMMUNITY PROTEIN)
Canis familiaris skeletal muscle chloride channel ClC-1 (CLCN1) mRNA, complete cds
Mycobacterium tuberculosis H37Rv complete genome; segment 103/162
Eunice australis histone H3 (H3) gene, partial cds
PM0-BT0547-310100-002-b04 BT0547 Homo sapiens cDNA
601875654F1 NIH_MGC_55 Homo sapiens cDNA clone IMAGE:4099716 5'
601875654F1 NIH_MGC_55 Homo sapiens cDNA clone IMAGE:4099716 5'
Homo sapiens chromosome 21 segment HS21C08D
Methanococcus jannaschii section 111 of 150 of the complete genome
Archaeoglobus fulgidus section 63 of 172 of the complete genome
602072585F1 NCI_CGAP_Brn67 Homo sapiens cDNA clone IMAGE:4215284 5'
602072585F1 NCI_CGAP_Brn67 Homo sapiens cDNA clone IMAGE:4215284 5'
Murine 1 gene for MHC class II (Ia) associated invariant chain
Plasmodium falciparum R29R+var1 gene, exon 1
Homo sapiens glutathione S-transferase theta 2 (GSTT2) and glutathione S-transferase theta 1 (GSTT1) genes, complete cds
MICROSOMAL DIPEPTIDASE PRECURSOR (MDP) (DEHYDROPEPTIDASE-I) (RENAL DIPEPTIDASE) (RDP)
EXTENSIN PRECURSOR (CELL WALL HYDROXYPROLINE-RICH GLYCOPROTEIN)
EXTENSIN PRECURSOR (CELL WALL HYDROXYPROLINE-RICH GLYCOPROTEIN)
wf67g03.x1 Soares_NFL_T_GBC_S1 Homo sapiens cDNA clone IMAGE:2360692 3'
YY1 PROTEIN PRECURSOR
GENE 68 PROTEIN
GENE 68 PROTEIN


Top Hit 

Database 


SWISSPROT 


SWISSPROT j 


z 


(sWISSPROT | 


SWISSPROT | 


SWISSPROT i 


z 


SWISSPROT 


SWISSPROT ! 


SWISSPROT j 


Z 


z 


z 


1 


EST HUMAN j 


EST HUMAN 


z 


z 


z 


z 

3 


EST HUMAN | 


z 


z 


z 


IsWISSPROT i 


SWISSPROT 1 


SWISSPROT 


EST HUMAN 


SWISSPROT 


SWISSPROT 


O 
1 


Top HitAcession 
No. 


1 
a 




AL161571.2 | 


|q91062 l 


| 

5 


1 

5 


L43126.1 ! 


P54098 


Q27905 | 


I 


I 


Z83860.1 






| 


BF24D552.1 


1 


I 


AE001 044.1 


1 

1 


i 
1 


X13414.1 I 


Y13402.1 | 


I 

< 


|pi6444 | 


1 


1 


1 
< 


023810 




2 

! 


Most Similar 
(Top) Hit 
BLAST E 
Value 


5.6E+00| 


5.5E+00| 


5.5E+00| 


I 5.4E-i-Oo! 


5.4E+00| 


5.4E+0OI 


5.3E+00| 


5.3E+00| 


5.3E+00I 


1 


5.0E+00I 


5.0E+00| 


i 


4.8E+00I 


4.7E+00j 


4.7E+00| 


4.7E+00| 




4.5E+00| 


4.4E+00| 


4.4E+00I 


4.4E+00I 


4.3E+00] 


I 4.3E+Oo| 


I 4.2E+Oo| 


4.2E+00| 


4.2E+00' 


4.2E+00| 


4.1E+00j 


4.1E+00I 


4.1E+001 


Expression 


2M\ 












a 


S 








10.53| 


i 






a. 








1 


1 


ffi. 






1 


C? 




S 


7.65[ 
















ORFSEQ 
ID NO: 










27849| 


278501 


s 




290031 


8 


28050! 








200711 


20071 | 


229581 


1 


1 


B 

SI 


227241 




I 








261 87| 


27249[ 


8 
S 


1 

8 




SEQID 

NO: 


7919 




8773 








1 


8 




1 


1 


8477 




1 


0251 


0251 


8 


I 


1 


1 


1 




6259 


18116 


15306| 


6043 




1 




6435! 


6435! 
































































Probe 
SEQID 
NO: 


8770- 


1 


ss 

3 


I 




7769| 


4675| 




1 




R 


1 


s 


1 


1 


s 


32361 


1 




3003) 


§ 


5764 i 


§ 


I 8235 




S 
8 




7182| 


65171 


8 


6577 




Top Hit Descriptor

Pan troglodytes novel repetitive solo LTR element in the RNU2 locus
602247938F1 NIH_MGC_62 Homo sapiens cDNA clone IMAGE:4333209 5'
HYPOTHETICAL PROTEIN HVLF1
601507510F1 NIH_MGC_71 Homo sapiens cDNA clone IMAGE:3909051 5'
INSULIN-LIKE GROWTH FACTOR BINDING PROTEIN 1 PRECURSOR (IGFBP-1) (IBP-1) (IGF-BINDING PROTEIN 1)
CELL DIVISION PROTEIN FTSY HOMOLOG
CYTOCHROME C OXIDASE POLYPEPTIDE III
GENOME POLYPROTEIN [CONTAINS: CAPSID PROTEIN C (CORE PROTEIN); MATRIX PROTEIN (ENVELOPE GLYCOPROTEIN M); MAJOR ENVELOPE PROTEIN E; NONSTRUCTURAL PROTEINS NS1, NS2A, NS2B, NS4A AND NS4B; HELICASE (NS3); RNA-DIRECTED RNA POLYMERASE (NS5)]
GENOME POLYPROTEIN [CONTAINS: CAPSID PROTEIN C (CORE PROTEIN); MATRIX PROTEIN (ENVELOPE GLYCOPROTEIN M); MAJOR ENVELOPE PROTEIN E; NONSTRUCTURAL PROTEINS NS1, NS2A, NS2B, NS4A AND NS4B; HELICASE (NS3); RNA-DIRECTED RNA POLYMERASE (NS5)]
N.tabacum chitinase gene 50 for class 1 chitinase C
Mus musculus seminal vesicle secretory protein 99 (MSVSP99) gene, promoter region
MRO-BN0070-300500-028-h05 BN0070 Homo sapiens cDNA
MRO-BN0070-300500-028-h05 BN0070 Homo sapiens cDNA
HYPOTHETICAL TRANSCRIPTIONAL REGULATOR IN AIDB-RPSF INTERGENIC REGION
Human MHC class II lymphocyte antigen (DPw4-beta-1) gene, exon 2
X.laevis mRNA for M4 muscarinic receptor
Homo sapiens NF2 gene
nr18a12.s1 NCI_CGAP_Ew1 Homo sapiens cDNA clone IMAGE:1168318 similar to gb:A10416 METALLOPROTEINASE INHIBITOR 1 PRECURSOR (HUMAN);
Helicobacter pylori, strain J99 section 123 of 132 of the complete genome
HUMSUPY135 Human brain cDNA Homo sapiens cDNA clone 148
Arabidopsis thaliana DNA chromosome 4, contig fragment No. 39
602120551F1 NIH_MGC_56 Homo sapiens cDNA clone IMAGE:4277748 5'
602120551F1 NIH_MGC_56 Homo sapiens cDNA clone IMAGE:4277748 5'
AV761055 MDS Homo sapiens cDNA clone MDSBUE10 5'
Pseudomonas aeruginosa PA01, section 8 of 529 of the complete genome
Pseudomonas aeruginosa PA01, section 8 of 529 of the complete genome


Top Hit 
Database 


z 


5 

Z: 
I 

EE 


SWISSPROT
EST_HUMAN
SWISSPROT
SWISSPROT
SWISSPROT
SWISSPROT
SWISSPROT

z
EST_HUMAN
X
SWISSPROT
z

EST_HUMAN

EST_HUMAN
z
I
z
I
EST_HUMAN
z
z


Top Hit Accession
No. 


U57503.1
BF692425.1
P09716
1
P47876
O33010
P14546
P07564
P07564
1
AF055466.1
BE814357.1
BE814357.1
f
M23907.1
1
§
AA661489.1
AE001562.1
D44725.1
AL161539.2
BF669279.1

AV761055.1
AE004447.1
AE004447.1


Most Similar 
(Top) Hit 
BLAST E 
Value 


4.1E+00 






4.1E+00 


oo+at't I 


1 


4.0E+00 






oo+36-e j 


3.9E+00| 




1 


? 


3.9E+00
3.9E+00
3.9E+00
3.9E+00

3.7E+00
3.7E+00
3.7E+00
3.6E+00
3.6E+00
3.6E+00


Expression 
Signal 






2.89 


12.46| 






2.17| 






3.89 1 












2.32| 




5.58 1 






10.09 1 


2.13| 










ORF SEQ
ID NO: 


3 


27641 






25241 


26285| 


s 
s 


28934 




23181 | 






25455 | 


26140 | 




1 








I 




28872| 




20322| 




I 


SEQ ID 
NO: 


1 


17427 


18139 


18216 


19231 


16131, 


1 














1 


i 








2459 




1 






0516 










































Probe 
SEQ ID 
NO: 


1 


| 7576 


1 


| 8339| 


l 9672 


I 


8 


1 


1 


8 
S 


4226 1 




5471| 


s 


in 
8 


! 


1 


| 871 s| 


1 


1 


1 


8730 1 




6 


7022| 


7022| 








I 
I 
2 
\ 


Escherichia coli glycerophosphate dehydrogenase (glpD) gene, partial cds; and the translation start site has been verified (glpE), the translation start site has been verified (glpG), and repressor protein (glpR) genes, complete cds
Cryptosporidium felis heat shock protein 70 (HSP70) gene, partial cds
Brassica napus RPB5d mRNA, complete cds
Homo sapiens chromosome 21 segment HS21C078
DNA-DIRECTED RNA POLYMERASE II LARGEST SUBUNIT
Saccharomyces cerevisiae MSS1 gene, complete cds
Homo sapiens DiGeorge syndrome critical region, centromeric end
Homo sapiens KIAA0480 gene product (KIAA0480), mRNA
Homo sapiens KIAA0480 gene product (KIAA0480), mRNA


I 
O 
*? 
o 
I 


D.rerio zp-50 POU gene
PHOSPHOGLYCERATE KINASE, CYTOSOLIC
PHOSPHOGLYCERATE KINASE, CYTOSOLIC
NADH-UBIQUINONE OXIDOREDUCTASE CHAIN 4
NADH-UBIQUINONE OXIDOREDUCTASE CHAIN 4
Chlamydomonas reinhardtii chloroplast DNA for rps9, ycf4, ycf3, rps18 genes
Chlamydomonas reinhardtii chloroplast DNA for rps9, ycf4, ycf3, rps18 genes
PERIPLASMIC [NIFE] HYDROGENASE SMALL SUBUNIT (NIFE HYDROGENLYASE SMALL CHAIN)
S.cerevisiae threonine deaminase (ILV1) gene, complete cds
Oryzias latipes OIGC6 gene for guanylyl cyclase C, complete cds
Sus scrofa choline acetyltransferase gene, promoter region
HYPOTHETICAL 142.5 KD PROTEIN C23E2.02 IN CHROMOSOME I
TYPE I IODOTHYRONINE DEIODINASE (TYPE-I 5'DEIODINASE) (DIOI) (TYPE 1 DI) (5DI)
TYPE I IODOTHYRONINE DEIODINASE (TYPE-I 5'DEIODINASE) (DIOI) (TYPE 1 DI) (5DI)
GLUTAMATE [NMDA] RECEPTOR SUBUNIT EPSILON 3 PRECURSOR (N-METHYL D-ASPARTATE RECEPTOR SUBTYPE 2C) (NR2C) (NMDAR2C)
DEOXYHYPUSINE SYNTHASE (DHS)
GENOME POLYPROTEIN [CONTAINS: CAPSID PROTEIN C (CORE PROTEIN); MATRIX PROTEIN (ENVELOPE PROTEIN M); MAJOR ENVELOPE PROTEIN E; NONSTRUCTURAL PROTEINS NS1, NS2A, NS2B, NS4A AND NS4B; HELICASE (NS3); RNA-DIRECTED RNA POLYMERASE (NS5)]
retinoic acid nuclear receptor isoform beta 2 [mice, embryonal carcinoma cell line, PCC7-MZ1, mRNA, 2971 nt]


Top Hit 


Source 


NT
NT

NT
SWISSPROT
NT
NT
NT
z
NT
NT
SWISSPROT
SWISSPROT
SWISSPROT
SWISSPROT
NT
z
SWISSPROT
z
NT
NT
SWISSPROT
SWISSPROT
SWISSPROT
SWISSPROT
SWISSPROT
SWISSPROT




Top Hit Accession


No. 


2 


AF221538.1
AF254577.1

P04052
AF013167.1
L77570.1
7662155
I
1
X96422.1
P12783
P12783
P18931
1
Y13655.1
Y13655.1

M36383.1
AB016081.2
L33836.1
Q10135


S 

I 


1 


Q14957 | 


1 


1 


i 


Most Similar 
(Top) Hit


BLAST E 
Value 


3.6E+00
3.5E+00
3.4E+00
?
s
3.4E+00
3.4E+00
3.4E+00
3.3E+00
3.3E+00
3.2E+00
3.2E+00
3.2E+00
3.2E+00
3.2E+00
§
3.2E+00
3.2E+00
3.2E+00
3.2E+00
3.2E+00
3.2E+00
3.1E+00
3.1E+00
3.1E+00
3.1E+00
3.1E+00
3.1E+00
3.1E+00


1 

1 


Signal 




1.04| 










1.92| 






1.39| 








2.06| 




2.35| 


2.35| 


CO 








2.42| 


4.35| 










3.35 


ORF SEQ


ID NO: 






s 




I 


i 


% 


S 


1 


% 

8 


20248| 


1 




25825j 






26610 1 




27567| 


1 




25574j 


27067| 


1 




8 






1 


:ON 




3133 


1 


2406 


621 7i 




8631 


1 


1 




1 


1 


5359 [ 


5713 


5713 


6427 


6427I 


I 






1 




6877 




7238 


7770 


1 


1 


























































Probe 
SEQ ID 
NO: 


1 


3209} 


1 


2532 1 


8 




§ 






9 


1 


I 


s 


1 




I 




1 


7491| 


. 7837| 


8 


1 


7000 1 


7000 [ 


7334| 




8 
ES 


8774) 
4


(RETGC-2) 
VTE CYCLASE 


































i 
































z 
























complete 














Top Hit Descriptor 


Homo sapiens hypothetical protein PRO0889 (PRO0889), mRNA


S.aureus genes encoding Sau96I DNA methyltransferase and Sau96I restriction endonuclease


CYR61 PROTEIN PRECURSOR (3CH51) 


B.napus DNA for myrosinase 


RETINAL GUANYLYL CYCLASE 2 PRECURSOR (GUANYLATE CYCLASE 2F, RETINAL) 
(ROD OUTER SEGMENT MEMBRANE GUANYLATE CYCLASE 2) (ROS-GC2) (GUANYL/ 
jF)(GC-F) 


RETINAL GUANYLYL CYCLASE 2 PRECURSOR (GUANYLATE CYCLASE 2F, RETINAL) 
(ROD OUTER SEGMENT MEMBRANE GUANYLATE CYCLASE 2) (ROS-GC2) (GUANYL 

| F) (GC-F) 


Chlamydophila pneumoniae AR39, section 53 of 94 of the complete genome 


F.pringlei gdcsPA gene for P-protein of the glycine cleavage system 


BRAIN-SPECIFIC ANGIOGENESIS INHIBITOR 1 PRECURSOR 


BRAIN-SPECIFIC ANGIOGENESIS INHIBITOR 1 PRECURSOR 


ADHERENCE FACTOR (ADHESION AND AGGREGATION MEDIATING SURFACE ANTK 


Buxus harlandii maturase K (matK) gene, partial cds; chloroplast gene for chloroplast product 


Arabidopsis thaliana DNA chromosome 4, contig fragment No. 52 


Mus musculus endomucin (LOC53423), mRNA 


Mus musculus per-hexamer repeat gene 3 (Phxr3), mRNA 


Mus musculus per-hexamer repeat gene 3 (Phxr3), mRNA 


Homo sapiens apoA polymorphism Kringle IV gene, exons 1 and 2 


Botrytis cinerea strain T4 cDNA library under conditions of nitrogen deprivation 


CM0-BT0281-031199-087-h04 BT0281 Homo sapiens cDNA 


Mus musculus sphingosine kinase (SPHK1 b) mRNA, complete cds 


Mus musculus SRY-box containing gene 13 (Sox13), mRNA 


Mus musculus SRY-box containing gene 13 (Sox13), mRNA 


Mus musculus SH2-containing inositol 5-phosphatase (Ship) gene, exons 16 through 27, and


faba bean necrotic yellows virus C2-Eg gene, isolate Egyptian EV1-93 


faba bean necrotic yellows virus C2-Eg gene, isolate Egyptian EV1-93 


Arabidopsis thaliana DNA chromosome 4, contig fragment No. 40 


Mus musculus cleavage and polyadenylation specificity factor 3 (Cpsf3), mRNA


Homo sapiens ATP-binding cassette, sub-family B (MDR/TAP), member 4 (ABCB4), mRNA 


Aspergillus nidulans recQ gene for DNA helicase, exons 1-4 








>ROT | 




3 ROT 


I 






'ROT I 


'ROT | 


'ROT | 
















I 
































i 








I 




I 








































z 


Z 










z 






5 




z 




z 






z 








z 










z 








1 


1 


























3724 I 


1 


I 












1 










s 
g 


1 




Top Hit Aces 
No. 


1 




P18406 


X67838.1 


I 


P51842 




1 


014514 


5 


§ 
I 


s 


AL1 61 552.2 


1 


ss 


s 


L1 4005.1 


i 




AF068749.1 




1 


AF235502.1 


AJ132180.1 


i 

3 


AL161 540.2 


1141 


AJ271 844.1 


Most Similar 
(Top) Hit 
BLAST E 
Value 


3.0E+00
3.0E+00

3.0E+00
3.0E+00
3.0E+00

2.9E+00
2.9E+00
2.9E+00
2.9E+00

2.8E+00

2.7E+00
2.7E+00

2.7E+00

2.6E+00
2.6E+00
2.6E+00

2.6E+00
2.6E+00
§
2.5E+00


r 


P 


s 




" 




6.51 




s 






5.191 








1 




1,75 






4.35 




1.97| 


J 






2.95| 


1.52| 




2.08| 


ORF SEQ
ID NO: 


22535 1 


24970 1 






28501 


28502 




a 


26300 I 




$ 


21212| 






20012| 


1 


25377I 








25373 I 


253741 




1 


1 


27634| 








SEQ ID
NO: 


12737 


15195| 




170411 


18251 


18251 


1 




161461 


I 


162711 


11346| 




8 


1 




1 


17062| 


I 


s 




1 


16391 j 


16576 


16576| 


17418| 


17777! 


1 


8 


Probe 
SEQ ID
NO: 


2807 


5273 


S 


7164| 


it 
8 


8374 


1 




62821 


I 


64101 


i 




6325| 


8 


1 


54081 


7185j 


1 


4576 


5405 1 


5405| 


I 6533 


I 


1 


7567 


7927 


1 


1448 
Top Hit Descriptor

Aspergillus nidulans recQ gene for DNA helicase, exons 1-4
TEICHOIC ACID BIOSYNTHESIS PROTEIN F
TEICHOIC ACID BIOSYNTHESIS PROTEIN F
TEICHOIC ACID BIOSYNTHESIS PROTEIN F
TEICHOIC ACID BIOSYNTHESIS PROTEIN F
QV4-FT0005-110500-205-g07 FT0005 Homo sapiens cDNA
Rice DNA for aldolase C-1, complete cds
Mus musculus EIF4H gene, partial cds; LIMK1 gene, complete cds; and ELN gene, partial cds
Chicken alpha-3 collagen type VI mRNA, 3' end
Homo sapiens double C2-like domains, alpha (DOC2A) mRNA
VITELLOGENIN 1 PRECURSOR (YOLK PROTEIN 1)
CD27L RECEPTOR PRECURSOR (T-CELL ACTIVATION ANTIGEN CD27) (T14)
CD27L RECEPTOR PRECURSOR (T-CELL ACTIVATION ANTIGEN CD27) (T14)
Helicobacter pylori, strain J99 section 47 of 132 of the complete genome
RG2-PT0004-031299-011-d05 PT0004 Homo sapiens cDNA
ENDOCHITINASE B PRECURSOR (CHN-B)
SKIN GRANULE PROTEIN PRECURSOR
SKIN GRANULE PROTEIN PRECURSOR
H.sapiens CTGF gene and promoter region
XYLULOSE KINASE (XYLULOKINASE)


1 

| 

1 
1 

s 

1 

I 

2 
% 

o 

o 
z 


hr63f06.x1 NCI_CGAP_Kid11 Homo sapiens cDNA clone IMAGE:3133187 3'
Bacillus subtilis chromosomal DNA, region 75 degrees: glpPFKD operon and downstream
Fragaria x ananassa cytosolic ascorbate peroxidase (ApxSC) gene, ApxSC-c allele, complete cds
G.domesticus artificial single chain antibody gene (L3)
Bos taurus partial cytb gene for cytochrome b
Rattus norvegicus ATPase, Ca++ transporting, ubiquitous (Atp2a3), mRNA
MAJOR CENTROMERE AUTOANTIGEN B (CENTROMERE PROTEIN B) (CENP-B)
M.mazei dnaK and dnaJ genes homologues coding for DnaK and DnaJ
ALPHA-(1,3)-FUCOSYLTRANSFERASE (GALACTOSIDE 3-L-FUCOSYLTRANSFERASE) (FUCOSYLTRANSFERASE 4) (FUCT-IV)
602069121F1 NIH_MGC_58 Homo sapiens cDNA clone IMAGE:4068173 5'
602069121F1 NIH_MGC_58 Homo sapiens cDNA clone IMAGE:4068173 5'


Top Hit 

Database 


z
SWISSPROT
SWISSPROT
SWISSPROT
SWISSPROT
EST_HUMAN
z
z
z
z
SWISSPROT
SWISSPROT
O
I
z
EST_HUMAN
SWISSPROT
SWISSPROT
SWISSPROT
z
SWISSPROT
I
UJ
EST_HUMAN
z
z
z
Z
z
SWISSPROT
z
SWISSPROT
EST_HUMAN
EST_HUMAN


Top Hit Accession
No. 


1271844.1 I 


1 3485 I 


I3485 | 


1 


13485 | 


jAW949158.1 | 


JD50307.1 | 


JAF289665.1 ] 


5 


4503352 [ 


JP02843 | 


IP26842 | 




i 


IAW875126.1 I 


;P24091 | 


|P13673 j 


P13673 | 


1X92511.1 | 


1 

£ 


1 

1 


;BE326702.1 j 


IY14079.1 ) 


AF1 58652.2 \ 


IZ46724.1 | 


§ 

5 


1 


|P07199 S 


i 


Q11127 ! 


BF541987.1 ] 


BF541987.1 | 


< 










Most Similar 
(Top) Hit 
BLAST E 
Value 






I 




2.5E+00
2.5E+00
2.5E+00

1
1
2.4E+00
1
f
2.4E+00
2.4E+00
2.4E+00
2.4E+00

2.4E+00
?
2.4E+00

2.4E+00
2.4E+00
2.3E+00

2.3E+00
§
2.3E+00
2.3E+00
2.3E+00


Expression 






8 






s 






s. 


8. 






N 


s 




















s 


«. 
















ORF SEQ 
ID NO: 




| 25537| 


| 25538| 


| 25537| 


| 25538| 


1 








1 


1 

8 


| 26802| 


j 26803 1 






I 271941 


| 27853 1 


! 27854[ 


| 27911 | 




! 280021 




| 28558| 


3 


| 20992| 




j 26431 | 




1 


27364| 


8 

3 


S 


SEQ ID 
NO: 


11353 


8 


i 


s 


S 


1 




1 


1 


1 


i5569| 


166131 


1 




I 


17002 










17764 


i 


1 






13934 






1 


171 65) 






































































Probe 
SEQ ID 
NO: 


I 1448 


S 


1 


I 


$ 


1 


7264 




I 


1 


| 5657| 


1 


| 6733 | 


1 


I 


1 7125| 






| 7820 | 


§ 


| 7914| 


| 7914| 




I 


1 


1 




I 


§ 


1 


1 


1 
Top Hit Descriptor

601433673F1 NIH_MGC_72 Homo sapiens cDNA clone IMAGE:3918643 5'
Neurospora crassa G protein alpha subunit GNA-3 (gna-3) gene, complete cds
Rat gene for regucalcin, exon1 (non-coding exon)
Rat gene for regucalcin, exon1 (non-coding exon)
SORTILIN-RELATED RECEPTOR PRECURSOR (SORTING PROTEIN-RELATED RECEPTOR CONTAINING LDLR CLASS A REPEATS) (MSORLA) (SORLA-1) (LOW-DENSITY LIPOPROTEIN RECEPTOR RELATIVE WITH 11 LIGAND-BINDING REPEATS) (LDLR RELATIVE WITH 11 LIGAND-BINDING REPEATS) (LR11) (>
SORTILIN-RELATED RECEPTOR PRECURSOR (SORTING PROTEIN-RELATED RECEPTOR CONTAINING LDLR CLASS A REPEATS) (MSORLA) (SORLA-1) (LOW-DENSITY LIPOPROTEIN RECEPTOR RELATIVE WITH 11 LIGAND-BINDING REPEATS) (LDLR RELATIVE WITH 11 LIGAND-BINDING REPEATS) (LR11) (>
600943401T1 NIH_MGC_17 Homo sapiens cDNA clone IMAGE:2959777 3'
MINOR VIRION STRUCTURAL PROTEIN MU-2
INSULIN-LIKE GROWTH FACTOR II PRECURSOR (IGF-II) (SOMATOMEDIN A)
n!95b02.s1 NCI_CGAP_Co10 Homo sapiens cDNA clone IMAGE:1058379 3'


1 
f. 

1 

1 

< 
Z 
Q 

1 
I 

! 
1 

I 

l 


601594733F1 NIH_MGC_9 Homo sapiens cDNA clone IMAGE:3948561 5'
TRANSPOSON TY1 PROTEIN A
qm69b03.x1 Soares_placenta_8to9weeks_2NbHP8to9W Homo sapiens cDNA clone IMAGE:1893965 3' similar to gb:Y00433 GLUTATHIONE PEROXIDASE (HUMAN);
qm69b03.x1 Soares_placenta_8to9weeks_2NbHP8to9W Homo sapiens cDNA clone IMAGE:1893965 3' similar to gb:Y00433 GLUTATHIONE PEROXIDASE (HUMAN);
501855591F1 NIH_MGC_57 Homo sapiens cDNA clone IMAGE:4075391 5'
Homo sapiens ovarian granulosa cell 13.0 kDa protein hGR74 homolog mRNA, complete cds
UROMODULIN PRECURSOR (TAMM-HORSFALL URINARY GLYCOPROTEIN) (THP)


O 

< 
< 


Mus musculus pre-T cell receptor alpha gene, enhancer region and upstream region


1 

I 
I 

i 
1 

I 

s 

< 
o 
o 

o 

1 
1 

i 
5 


ALPHA-2-HS-GLYCOPROTEIN PRECURSOR (FETUIN-A)
yy08a10.s1 Soares melanocyte 2NbHM Homo sapiens cDNA clone IMAGE:270618 3' similar to gb:M55654 TRANSCRIPTION INITIATION FACTOR TFIID (HUMAN);
AU123630 NT2RM2 Homo sapiens cDNA clone NT2RV2000571 5'
Homo sapiens p22Dokdel (DOKDEL) mRNA, complete cds


Top Hit 
Database 


EST_HUMAN
z

SWISSPROT
SWISSPROT
EST_HUMAN
SWISSPROT
SWISSPROT
EST_HUMAN
EST_HUMAN
EST_HUMAN
l
1
m
EST_HUMAN
EST_HUMAN
EST_HUMAN
z
SWISSPROT
SWISSPROT

I
i
SWISSPROT
EST_HUMAN
EST_HUMAN


5 


Top Hit Accession
No. 


BE895237.1 j 


AF281 862.1 | 


067071.1 


D67071.1 


088307 


O 


l 


1 


P51459 | 


AA594574.1 


i 


BE741 678.1 


Q04706 


|aI290373.1 


Ll290373.1 


BF246782.1 


AF183416.1 


P07911 


| 


1 


I 
1 


070159 


1 


AU123630.1 


AF180527.1 


Most Similar 
(Top) Hit 
BLAST E 
Value 


2.3E+00


<N" 


2.2E+00
2.2E+00
2.2E+00
cl
2.2E+00
2.2E+00

2.2E+00
2.2E+00
2.2E+00
2.2E+00
2.2E+00

2.2E+00
2.2E+00
2.2E+00
2.2E+00
2.1E+00
2.1E+00
2.1E+00
2.1E+00

2.0E+00


Expression 
Signal 






Sj 






10.08 


1 




Sj 




if? 


11.83| 






8 






















ORF SEQ
ID NO: 


25333 i 




23891 | 


% 


24973 


24974 


257071 


25853 1 


26005I 










1 




27902 


§ 


281321 


1 


s 




§ 


I 26O87I 




20934' 


SEQ ID 

NO: 


190021 


19405! 






1 




5605 








i 


7250 


1 


1 


1 


1 


1 




1 


1 


1 


6057I 


1 




1 










































Probe
SEQ ID

NO: 


9304, 


I 


421 6 j 


1 


5276 


5276 


1 


1 


s 


61551 


I 


1 


75181 


8 


1 7783 1 


78121 


7975| 


1 






1 


8 




I 


1178 
Top Hit Descriptor

Homo sapiens p22Dokdel (DOKDEL) mRNA, complete cds
Oryctolagus cuniculus Na+,K+-ATPase beta 1 subunit mRNA, complete cds
PUTATIVE RRNA METHYLTRANSFERASE SPB1
R.norvegicus mRNA for collagen alpha1 type I
R.norvegicus mRNA for collagen alpha1 type I
hi13c05.x1 NCI_CGAP_GU1 Homo sapiens cDNA clone IMAGE:2972168 3' similar to gb:X01677 GLYCERALDEHYDE 3-PHOSPHATE DEHYDROGENASE, LIVER (HUMAN);
hi13c05.x1 NCI_CGAP_GU1 Homo sapiens cDNA clone IMAGE:2972168 3' similar to gb:X01677 GLYCERALDEHYDE 3-PHOSPHATE DEHYDROGENASE, LIVER (HUMAN);
Escherichia coli O157 DNA, map position at 46 min., complete cds
Escherichia coli O157 DNA, map position at 46 min., complete cds
Escherichia coli O157 DNA, map position at 46 min., complete cds
HSPD22703 HM3 Homo sapiens cDNA clone s4000117B08
Gallus gallus mitochondrion, complete genome
Mus musculus inositol 1,4,5-triphosphate receptor 1 (Itpr1), mRNA
Mus musculus inositol 1,4,5-triphosphate receptor 1 (Itpr1), mRNA
CTD-BINDING SR-LIKE PROTEIN RA4
COLLAGEN ALPHA 2(I) CHAIN PRECURSOR
COLLAGEN ALPHA 2(I) CHAIN PRECURSOR
CM3-MT0114-010900-323-h12 MT0114 Homo sapiens cDNA
ARGININE DEIMINASE (ADI) (ARGININE DIHYDROLASE) (AD)
PROTEIN B8 PRECURSOR
Synechococcus sp. PCC7942 copper transporting P-ATPase (ctaA) and ATP synthase epsilon subunit (atpE) genes, complete cds
Synechococcus sp. PCC7942 copper transporting P-ATPase (ctaA) and ATP synthase epsilon subunit (atpE) genes, complete cds
HEDGEHOG RECEPTOR (PATCHED PROTEIN)
601897854F1 NIH_MGC_19 Homo sapiens cDNA clone IMAGE:4127364 5'
601893489F1 NIH_MGC_17 Homo sapiens cDNA clone IMAGE:4139038 5'
EMBRYONAL FYN-ASSOCIATED SUBSTRATE (HEFS)
yh72c08.r1 Soares placenta Nb2HP Homo sapiens cDNA clone IMAGE:135278 5'
Homo sapiens PRO0530 mRNA, complete cds
HYPOTHETICAL 75.5 KD PROTEIN IN SDH1-CIM5/YTA3 INTERGENIC REGION


Top Hit 
Database 


z
z
SWISSPROT
z
z
EST_HUMAN
z
z
z
Z
z
2
I
z
z
z
SWISSPROT
SWISSPROT
SWISSPROT
EST_HUMAN
SWISSPROT
SWISSPROT
z
z
SWISSPROT
EST_HUMAN
EST_HUMAN
SWISSPROT


r> 

X 


z 


O 
rr 

1 
% 


.1 
| 

E 


1! 

L 


1 
< 


|AF204927.1 ! 


|P25582 i 


|Z78279.1 I 




| 

1 


I 


I 
1 


I 

1 


l 
1 




I 


1 


8 
1 


JQ63627 ! 


|P02467 | 






O 


1 


s 


U04355.1 | 




1BF311999.1 I 




1043281 j 


|R31042.1 ] 






Most Similar


BLAST E 
Value 


| 2.0E+00| 


| 2.0E+00| 


2.0E+00| 


2.0E+00| 


] 2.0E+00| 


2.0E+00 


1 


2.0E+00| 


2.0E+00| 


2.0E+00| 


2.0E+00| 




1 00+36" 


.9E+00 




! 00+36' 


00+36- 


.9E+00| 


1 00+36- 


.8E+00j 


.8E+001 




,8E+00j 


.8E+00| 


,8E+00| 


.8E+00! 






1 00+38' 






































! 

| 


I 

.83 






































s 


















3 




• c 
c 


6 
z 

Q 


| 20935 1 


| 21078 




I 21 890 | 


| 21891 [ 


23692| 


1 


26751 | 


26752| 




27237 1 


1 


















Si 






j 25723 j 


s 

8 


! 27210, 


S 27369 1 






& 


SEQ ID 
NO: 


1 






§ 


1 






IS558! 




























§ 


1 


1 5620 1 






17169 




i 




























































Probe


SEQ ID 
NO: 


| 1178 






| 2102 


| 2102 




1 




e 


1 


| 7170| 


1 






1 






| 7032j 


s? 


8 


1 3082| 


1 3082 


1 


| 5712 


1 




| 7293 


1 


I 
Top Hit Descriptor

Chlamydomonas reinhardtii alternative oxidase 1 (AOX1) gene, nuclear gene encoding mitochondrial protein
Rattus norvegicus Actin-related protein complex 1b (Arpc1b), mRNA
601903309F1 NIH_MGC_19 Homo sapiens cDNA clone IMAGE:4135586 5'
LEVANSUCRASE (BETA-D-FRUCTOFURANOSYL TRANSFERASE) (SUCROSE 6-FRUCTOSYL TRANSFERASE)
Homo sapiens chromosome 21 segment HS21C080
oz43h05.x1 Soares_NhHMPu_S1 Homo sapiens cDNA clone IMAGE:1678137 3'
LEVANSUCRASE (BETA-D-FRUCTOFURANOSYL TRANSFERASE) (SUCROSE 6-FRUCTOSYL TRANSFERASE)
CM0-BT0282-171299-127-e05 BT0282 Homo sapiens cDNA
CM0-BT0282-171299-127-e05 BT0282 Homo sapiens cDNA
COUP TRANSCRIPTION FACTOR 1 (COUP-TF1) (COUP-TF I)
601894255F1 NIH_MGC_17 Homo sapiens cDNA clone IMAGE:4140084 5'
HOMEOBOX PROTEIN DLX-3
HOMEOBOX PROTEIN DLX-3
67B7 Human retina cDNA Tsp509I-cleaved sublibrary Homo sapiens cDNA not directional
tu82d07.x1 NCI_CGAP_Gas4 Homo sapiens cDNA clone IMAGE:2257549 3' similar to contains MSR1.t1 MSR1 repetitive element ;
qf50b01.x1 Soares_testis_NHT Homo sapiens cDNA clone IMAGE:1753417 3' similar to contains L1.t1 L1 repetitive element ;
Homo sapiens lens epithelium-derived growth factor gene, alternatively spliced, complete cds
Homo sapiens small proline-rich protein (SPRR3) gene, exons 1, 2, and 3 and complete cds
Mus musculus ST6GalNAcIII gene, exon 2
B.napus gene encoding endo-polygalacturonase
zd25f01.r1 Soares fetal heart NbHH19W Homo sapiens cDNA clone IMAGE:341689 5' similar to gb:D29805 N-ACETYLLACTOSAMINE SYNTHASE (HUMAN);
602185095T1 NIH_MGC_45 Homo sapiens cDNA clone IMAGE:4310591 3'
Homo sapiens proliferation-associated SNF2-like protein (SMARCA6) mRNA, complete cds
Homo sapiens proliferation-associated SNF2-like protein (SMARCA6) mRNA, complete cds
Mus musculus ST6GalNAcIII gene, exon 2
Mus musculus ST6GalNAcIII gene, exon 2
Brachydanio rerio MHC class II DA-beta-2*01 gene, 3' end
RC0-CT0415-200700-032-C10 CT0415 Homo sapiens cDNA


3 Hit 






1 
2 

3 


PROT | 




JMAN | 


b 
I 


JMAN [ 


JMAN | 


PROT | 


JMAN [ 


PROT j 


3ROT | 


JMAN | 


JMAN | 


JMAN | 










I 


JMAN | 














2 to n 
Q 


| NT 


z 


I 

E2 


Iswissi 


I NT 


1 


1 


I 
ul 


x 


|SWISSi 


| EST HI 


! SSIMS| 


§ 


x 


| EST HI 


X 

Eg 


NT 


I NT 


|NT 


IN 


X 
UJ 


j EST HI 


Z 


|NT 


|NT 


IN I 


| NT 


]EST_HI 


Top Hit Accession
No. 


1 

< 


1 
§ 


JBF316805.1 


|q60114 


§ 


jAI141067.1 


§ 
0 


1 

a 




]Q9TTR8 


|BF308000.1 


S 
1 


|O60479 


jW22424.1 


AI678443.1 


1 
< 


|AF199339.1 


|AF077374.1 


|Y1 1344.1 | 


I 


W58426.1 , 


SBF570077.1 | 


8 
1 


8 
< 


IY1 1344.1 ! 


|Y11 344.1 | 




i 


Most Similar 
(Top) Hit 
BLAST E 
Value 


1 00+38' V 


1 00+38' I 


1 00+38' V 


§ 


1.7E+00| 




1 
B 


1.7E+00| 


1.7E+00| 


§ 


1.7E+00| 




1 


1 


? 


§ 


1 00+39' V 


I 


2 


I 


I 


! 


? 


1 00+39" i 


1.6E+00| 


1.6E+00| 


1 00+39' I 


1 


Expression 


4.01 1 








i 3.25| 














2.15| 




1.78| 




2.2s| 


16.73 1 


I 3.611 


2 


! 




5.14| 




s 


3.14) 






S 


ORF SEQ 
ID NO: 








1 


| 22013j 


| 22101 1 


| 24037I 


| 25424 | 


| 25425 1 


| 25673 1 




| 27355 1 


s 


| 28977 | 




| 25185| 


j 21 775 | 


1 217831 


| 21 788 1 










1 23929| 


| 24657 1 


| 24658 | 


1 




SEQ ID 
NO: 






I 




| 12110 


j 12202 




1 


! 15369| 


! 15575| 




1 










11882| 


| 118911 


1 


j 12122j 


1 


| 13854 




| 14154: 


1 


1 






Probe 
SEQ ID 
NO: 


I 


| 95041 


! 


| 1092I 


1 


| 2321| 


1 


I 


| 5448 1 


1 


| 7014j 


| 7282] 


1 7282| 


I 




1 


1 


1 


| 2003 | 


| 2238| 


1 




1 4255 


| 4255; 


| 5016| 


| 5016 


1 


1 
Top Hit Descriptor

Ovis aries prion protein gene, complete cds
Human papillomavirus type 7 genomic DNA
Fugu rubripes neurofibromatosis type 1 (NF1), A-kinase anchor protein (AKAP84), BAW protein (BAW), and WSB1 protein (WSB1) genes, complete cds
Fugu rubripes neurofibromatosis type 1 (NF1), A-kinase anchor protein (AKAP84), BAW protein (BAW), and WSB1 protein (WSB1) genes, complete cds
Chlamydia muridarum, section 55 of 85 of the complete genome
Chlamydia muridarum, section 55 of 85 of the complete genome
Homo sapiens Mad4 homolog (MAD4) mRNA
CM0-NN1005-140300-285-h06 NN1005 Homo sapiens cDNA
CM0-NN1005-140300-286-h06 NN1005 Homo sapiens cDNA
602156637F1 NIH_MGC_83 Homo sapiens cDNA clone IMAGE:4297556 5'
PEROXISOME PROLIFERATOR ACTIVATED RECEPTOR ALPHA (PPAR-ALPHA)


I 

1 

i 

Q 

O 
O 

O 

s 
1 


Homo sapiens mRNA for KIAA1157 protein, partial cds
DNA TOPOISOMERASE III ALPHA
Homo sapiens mRNA for KIAA0S05 protein, complete cds
SYNAPSIN II
SYNAPSIN II
Homo sapiens caveolin-1/-2 locus, Contig1, D7S522, genes CAV2 (exons 1, 2a, and 2b), CAV1 (exons 1 and 2)
Homo sapiens Xq pseudoautosomal region; segment 1/2
yg33f12.r1 Soares infant brain 1NIB Homo sapiens cDNA clone IMAGE:34345 5'
RC1-BT0313-301299-012-f05 BT0313 Homo sapiens cDNA
zr3Se09.r1 Soares_NhHMPu_S1 Homo sapiens cDNA clone IMAGE:665512 5' similar to contains element MER22 repetitive element ;
Homo sapiens APECED mRNA for AIRE-1, complete cds
601655184R1 NIH_MGC_65 Homo sapiens cDNA clone IMAGE:3845805 3'
601655184R1 NIH_MGC_65 Homo sapiens cDNA clone IMAGE:3845805 3'
Pneumocystis carinii f. sp. ratti guanine nucleotide binding protein alpha subunit (pcg1) gene, complete cds
Pneumocystis carinii f. sp. ratti guanine nucleotide binding protein alpha subunit (pcg1) gene, complete cds
Arabidopsis thaliana DNA chromosome 4, contig fragment No. 12


Top Hit 
Database 


z
z
z
z
z
z
Z
I
=1
EST_HUMAN
EST_HUMAN
SWISSPROT
X
a
S
SWISSPROT
z
SWISSPROT
SWISSPROT
z
z
EST_HUMAN
EST_HUMAN
EST_HUMAN
z

EST_HUMAN


I- 

z 




Top Hit Accession
No. 


1 
§ 


X74463.1 i 


£ 
< 


1 


AE002324.2 i 


AE002324.2 




1 


AW900455.1 | 


1 

£ 
m 


§ 


AW054976.1 | 


1 

§ 
< 


Q 13472 


AB020712.1 


Q92777 | 


Q92777 i 


AJ133269.1 


AJ271 735.1 


i 

8 

CrT 


1 


AA195528.1 


1 

§ 
< 


BE962107.2 


BE962107.2 


U30790.1 


U30790.1 
AL1 61 500.2 


Most Similar 
(Top) Hit 
BLAST E 
Value 










1.4E+00I 










1.4E+00| 


1.4E+00I 


1 


f 


1 


1 




? 
$ 


1 

HI 


? 








§ 
12501
13213


1 


140621 


143761 


150041 


152221 
171331


171 79 ! 


§ 




S 






1 § 
SEQ ID
§

St 


2634[ 


274e| 


1 


30341 
3291 1


s 


41621 


44821 
Top Hit Descriptor

M.mucedo gene encoding 4-Dihydromethyl-trisporate dehydrogenase
Cantharellus sp. partial 25S rRNA gene, isolate Tibet
Homo sapiens putative psihHbA pseudogene for hair keratin, exons 2 to 7
Homo sapiens zinc finger protein 157 (HZF22) (ZNF157) mRNA
Homo sapiens zinc finger protein 157 (HZF22) (ZNF157) mRNA
Coix lacryma-jobi dihydrodipicolinate synthase (dapA) gene, complete cds
Chlamydia muridarum, section 66 of 85 of the complete genome
Cyprinus carpio MRPb and MASPb genes for mannose-binding lectin-associated serine protease (MASP) and MASP-related protein, complete cds
601661233R1 NIH_MGC_72 Homo sapiens cDNA clone IMAGE:3915945 3'
Mus musculus alpha-spectrin 1, erythroid (Spna1), mRNA
Fugu rubripes gamma-aminobutyric acid receptor beta subunit gene, partial cds; 55kd erythrocyte membrane protein (P55), synaptic vesicle-associated integral membrane protein (VAMP-1), procollagen C-proteinase enhancer protein (PCOLCE) genes, complete c>
Mus musculus alpha-spectrin 1, erythroid (Spna1), mRNA
Plasmodium reichenowi partial 83/AMA-1 gene for apical membrane antigen 1
Plasmodium reichenowi partial 83/AMA-1 gene for apical membrane antigen 1
PM0-CT0289-291199-004-f08 CT0289 Homo sapiens cDNA
PM0-CT0289-291199-004-f08 CT0289 Homo sapiens cDNA
D.melanogaster no-on-transient A gene product, complete cds
Sus scrofa pip gene
601657145R1 NIH_MGC_67 Homo sapiens cDNA clone IMAGE:3866195 3'
Homo sapiens GL004 protein (GL004), mRNA
Homo sapiens heparan glucosaminyl N-deacetylase/N-sulfotransferase-2 gene, complete cds
S.alba phr-1 mRNA for photolyase
S.alba phr-1 mRNA for photolyase


Q 

O 

o 

1 

< 

I 

1 

o 
z 

1 
1 

§ 

i 

< — 

IS 

< < 

I Q 

O 2 


wo85a07.x1 NCI_CGAP_Kid11 Homo sapiens cDNA clone IMAGE:2462100 3'
601657145R1 NIH_MGC_67 Homo sapiens cDNA clone IMAGE:3866195 3'
Vibrio cholerae chromosome II, section 49 of 93 of the complete chromosome
Campylobacter jejuni kanamycin phosphotransferase (aphA-7) gene, complete cds
DIHYDROPYRIMIDINASE (DHPASE) (HYDANTOINASE) (DHP)
MRNA 3'-END PROCESSING PROTEIN RNA15


Top Hit 
Database 


z 


z 


H 












EST_HUMAN










z 


EST HUMAN | 


EST_HUMAN


z 


z 


EST HUMAN | 


z 




z 


z 


SWISSPROT | 


=) 
1 


EST HUMAN | 


z 


z 


SWISSPROT | 


SWISSPROT | 


Top Hit Accession
No. 


Z73640.1 


AJ271192.1 | 


? 


i 
s 


1 


U61730.2 I 


AE002338.2 | 


AB030447.1




6755621| 




AF016494.1 


6755621 1 


1 

! 


1 


AW362834.1 | 




M33496.1 | 


AJ009912.1 i 




1 


I 

< 


X72019.1 | 


X72019.1 | 


000754 | 


1 
1 


3 
| 


AE004392.1 [ 


M29953.1 | 


Q14117 | 


P25299 | 


Most Similar 
(Top) Hit 
BLAST E 
Value 


! 


00+36" 


.3E+00| 


1 
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3E+00| 
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i™ 
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2.09, 
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1 


1.55 


I' 


2.31| 




1.45| 














ORFSEQ 
ID NO: 




1 

8 




1 
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22632 1 


23258 


22632 j 


24607 [ 
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1 


25889 | 


1 








1 
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1 
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27961 1 
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SEQ ID 
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Top Hit Descriptor 




C.glutamicum pta gene and ackA gene


ah84g12.s1 Soares_testis_NHT Homo sapiens cDNA clone 1322374 3'


MR3-ST0191-140200-013-C05 ST0191 Homo sapiens cDNA | 


Mus musculus DSPP gene [ 


AV734585 cdA Homo sapiens cDNA clone cdAAFH03 5' ] 


L.lactis pyrD and pyrF genes ( 


Homo sapiens mRNA for KIAA1204 protein, partial cds




R.communis gene for pyrophosphate-dependent phosphofructokinase beta subunit | 


HUMHM01A01 Liver HepG2 cell line. Homo sapiens cDNA clone hm01a01


H.sapiens ENO3 gene for muscle specific enolase


PM0-ST0254-161199-001-d01 ST0264 Homo sapiens cDNA


PM1-HT0422-160200-007-g10 HT0422 Homo sapiens cDNA


Rattus norvegicus synapse-associated protein 102 mRNA, complete cds


Mus musculus 60 kDa ribonucleoprotein SSA/Ro gene, complete cds


Homo sapiens chromosome 21 segment HS21C003


Bacillus halodurans genomic DNA, section 9/14 | 


Human mRNA for KIAA0227 gene, partial cds j 


QV0-BN0042-170300-163-g12 BN0042 Homo sapiens cDNA


UI-HF-BR0p-ajk-f-02-0-UI.s1 NIH_MGC_52 Homo sapiens cDNA clone IMAGE:3074834 3'


Homo sapiens chromosome 21 segment HS21C013


Homo sapiens chromosome 21 segment HS21C013 | 


Homo sapiens hypothetical protein FLJ10749 (FLJ10749), mRNA I 


Homo sapiens hypothetical protein FLJ11280 (FLJ11280), mRNA


wf54h11.x1 Soares_NFL_T_GBC_S1 Homo sapiens cDNA clone IMAGE:2359461 3' similar to
SW:P531_HUMAN Q12888 P53-BINDING PROTEIN 53BP1 ;


Xylella fastidiosa, section 32 of 229 of the complete genome j 


Xylella fastidiosa, section 32 of 229 of the complete genome I 


Homo sapiens calpain 9 (nCL-4) (CAPN9) mRNA | 


Homo sapiens hypothetical protein FLJ10749 (FLJ10749), mRNA I 


R.unicornis complete mitochondrial genome [ 


Drosophila melanogaster cytoplasmic dynein heavy chain mRNA, complete cds


African swine fever virus, complete genome | 


Top Hit 
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NT | 


NT I 




EST HUMAN | 


NT | 


EST HUMAN | 
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EST HUMAN | 




w 
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ID NO: 
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•■ 25949 


26121 




26544 


I 


27285 
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28826 1 




28145! 


28787| 


25067| 




20217f 




21628! 


23009 j 


23010 | 
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23358' 




23565 1 




24520 
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c a .. 
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5729 1 
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S 


3426 1 


I 


3571 


3571 1 
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61 39 1 
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3862 1 


4122( 
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. 

I 


[ 
\ 


E.faecalis pbp5 gene 1 


Rattus norvegicus Aquaporin 4 (Aqp4), mRNA j 


601652776R1 NIH_MGC_58 Homo sapiens cDNA clone IMAGE:3825835 3'


qd85c03.x1 Soares_testis_NHT Homo sapiens cDNA clone IMAGE:1736260 3'


Herpes simplex virus type 1 (strain KOS) UL41 gene | 


Herpes simplex virus type 1 (strain KOS) UL41 gene ! 


Arabidopsis thaliana DNA chromosome 4, contig fragment No. 84 j 


602082582F1 NIH_MGC_81 Homo sapiens cDNA clone IMAGE:4246628 5' j 


Homo sapiens mRNA for KIAA0934 protein, partial cds | 


Arabidopsis thaliana DNA chromosome 4, contig fragment No. 27 | 


Mus musculus guanine nucleotide binding protein (G protein), gamma 3 subunit (Gng3), mRNA | 


Homo sapiens KIAA0626 gene product (KIAA0626), mRNA


Klebsormidium fluitans cytochrome c oxidase subunit 2 (cox2) gene, mitochondrial gene encoding
mitochondrial protein, partial cds


Homo sapiens hypothetical protein FLJ11280 (FLJ11280), mRNA


Petroselinum crispum cytosolic glucose-6-phosphate dehydrogenase 1 (cG6PDH1) mRNA, complete cds


Petroselinum crispum cytosolic glucose-6-phosphate dehydrogenase 1 (cG6PDH1) mRNA, complete cds




LOW TEMPERATURE ESSENTIAL PROTEIN | 


Taenia solium immunogenic protein Ts76 mRNA, partial cds j 


Dictyostelium discoideum isopentenyl pyrophosphate isomerase (Dipi) mRNA, complete cds j 


Xenopus laevis rhodopsin gene, complete cds ( 


Cavia cobaya mRNA for serine/threonine kinase, complete cds


Marchantia polymorpha genes for 26S rRNA, 5S rRNA, 18S rRNA, 5.8S rRNA and 26S rRNA | 


Girardia tigrina mRNA for homeodomain transcription factor (so gene) j 


Homo sapiens chromosome 21 segment HS21C018


Aedes aegypti mucin-like protein MUC1 mRNA, complete cds


V.carteri Algal-CAM mRNA


Plautia stali intestine virus RNA for nonstructural polyprotein, capsid protein precursor, complete cds | 


DNA GYRASE SUBUNIT B j 


DNA GYRASE SUBUNIT B | 
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AF234169.1 j 


U23808.1 ( 
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AJ251 660.1 i 
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Top Hit Descriptor 


Mus musculus Serf1 protein (Serf1), survival of motor neuron protein (Smn), neuronal apoptosis inhibitory
protein-rs6 (Naip-rs6), and neuronal apoptosis inhibitory protein-rs3 (Naip-rs3) genes, complete cds


3-OXO-5-ALPHA-STEROID 4-DEHYDROGENASE 1 (STEROID 5-ALPHA-REDUCTASE 1) (SR TYPE 1)


3-OXO-5-ALPHA-STEROID 4-DEHYDROGENASE 1 (STEROID 5-ALPHA-REDUCTASE 1) (SR TYPE 1)


HYPOTHETICAL 67.9 KD PROTEIN C6F12.08C IN CHROMOSOME I


af26g08.s1 Soares_total_fetus_Nb2HF8_9w Homo sapiens cDNA clone IMAGE:1032830 3' similar to
WP:C42D8.3 CE04204 ;contains element MER22 MER22 repetitive element ;


Rattus norvegicus neuromedin U precursor (NmU) gene, exons 5 and 6


Xenopus laevis rhodopsin gene, complete cds


Agaricus bisporus mRNA for tyrosinase | 


Homo sapiens calcium channel alpha1E subunit (CACNA1E) gene, exons 7-49, and partial cds, alternatively
spliced


Homo sapiens hypothetical protein FLJ10139 (FLJ10139), mRNA


Taenia ovis 45W antigen (ToW4) gene, complete cds


Rattus norvegicus mRNA for N-acetylglucosaminyltransferase III, complete cds


Bacillus subtilis 42.7kB DNA fragment from yvsA to yvqA


Hordeum vulgare gene encoding cysteine proteinase


Bos taurus micromolar calcium activated neutral protease 1 (CAPN1) gene, exons 11-20, and partial cds


Bos taurus micromolar calcium activated neutral protease 1 (CAPN1) gene, exons 11-20, and partial cds


FIBER PROTEIN




Mus musculus subtilisin-like serine protease LPC (PC7) gene, exons 1 to 9, partial cds | 


B-CELL RECEPTOR CD22 PRECURSOR (LEU-14) (B-LYMPHOCYTE CELL ADHESION MOLECULE)
(BL-CAM)


Homo sapiens endothelin-converting enzyme 2 (ECE2) mRNA, complete cds


ac79b08.s1 Stratagene lung (#937210) Homo sapiens cDNA clone IMAGE:868791 3'


601443950F1 NIH_MGC_65 Homo sapiens cDNA clone IMAGE:3848005 5'


601443950F1 NIH_MGC_65 Homo sapiens cDNA clone IMAGE:3848005 5'


Rattus norvegicus mRNA for N-acetylglucosaminyltransferase III, complete cds
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IZ97022.1 | 


AF248054.1 


AF248054.1




AW452782.1 | 
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IBE868267.1 i 
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Top Hit Descriptor 


PEROXISOMAL HYDRATASE-DEHYDROGENASE-EPIMERASE (HDE) (MULTIFUNCTIONAL BETA- 
OXIDATION PROTEIN) (MFP) [INCLUDES: 2-ENOYL-COA HYDRATASE ; D-3-HYDROXYACYL COA 
DEHYDROGENASE ] 


PEROXISOMAL HYDRATASE-DEHYDROGENASE-EPIMERASE (HDE) (MULTIFUNCTIONAL BETA- 
OXIDATION PROTEIN) (MFP) [INCLUDES: 2-ENOYL-COA HYDRATASE ; D-3-HYDROXYACYL COA 
DEHYDROGENASE] 


RC1-HT0229-181099-011-e06 HT0229 Homo sapiens cDNA


Human immunodeficiency virus type 1 (HIV-1), Isolate SF33,


601497581F1 NIH_MGC_70 Homo sapiens cDNA clone IMAGE:389S421 5'


Mus musculus chloride channel calcium activated 1 (Clca1), mRNA


Mus musculus chloride channel calcium activated 1 (Clca1), mRNA


AV689554 GKC Homo sapiens cDNA clone GKCCYA11 5'


Xenopus laevis zona pellucida C glycoprotein precursor (xlZPC) mRNA, complete cds


Xenopus laevis zona pellucida C glycoprotein precursor (xlZPC) mRNA, complete cds


AV758825 BM Homo sapiens cDNA clone BMFAWC04 5'


zh94a02.r1 Soares_fetal_liver_spleen_1NFLS_S1 Homo sapiens cDNA clone IMAGE:428906 5'




THROMBOMODULIN PRECURSOR (FETOMODULIN) (TM)


EST388293 MAGE resequences, MAGN Homo sapiens cDNA


Homo sapiens mRNA for KIAA1517 protein, partial cds


Drosophila melanogaster regulator of G-protein signalling LOCO III mRNA, complete cds


Drosophila melanogaster regulator of G-protein signalling LOCO III mRNA, complete cds


Homo sapiens chromosome 21 segment HS21C102


Apple mosaic virus RNA 2 putative polymerase gene, complete cds


SERINE/THREONINE PROTEIN KINASE MINIBRAIN


Lycopersicon esculentum putative Mil copy 1 nematode-resistance gene


B2 BRADYKININ RECEPTOR (BK-2 RECEPTOR)


Danio rerio mRNA for Eph-like receptor tyrosine kinase rtk8


AMINO-ACID ACETYLTRANSFERASE (N-ACETYLGLUTAMATE SYNTHASE) (AGS) (NAGS)


601653583R2 NIH_MGC_55 Homo sapiens cDNA clone IMAGE:3838461 3'


601653583R2 NIH_MGC_55 Homo sapiens cDNA clone IMAGE:3838461 3'


Enterobacteriaceae sp. JM983 partial groES gene for GroES-like protein and partial groEL gene for
GroEL-like protein, isolate JM983
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|AF245455.1 | 


AF245455.1


AL163302.2


1 


s 

2 




Q28642


AJ005029.1


P22567


BE957439.2


BE957439.2


AJ302158.1
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1 


] 27610! 


| 28005| 


j 28075| 


| 28076| 










| 21316| 


I 22360 1 




| 25441 | 








| 20265| 




| 23430 1 


| 26292| 


SEQ ID 
NO: 


16740 


I6740 


I 


S 




17320 


17320 


I7395( 




1 7397 1 




17835 


17835 






19350 1 


| 


114581 


I2467 


1 


153811 








I0454 


I3S44| 


1 3644 1 


1 






















































Probe 
SEQ ID 

NO: 


6861 


1 


1 


| 7064[ 


1 


1 


1 


| 7544 [ 


1 


I 7546I 


1 


1 


1 


8 

5 


I 95131 




1 


| 1553[ 


1 


8 
8 


I 


ft 

R 


| 7510| 


& 
S 




1 


| 3732| 








WO 01/57274 



PCT/US01/00666 



Top Hit Descriptor 


Enterobacteriaceae sp. JM983 partial groES gene for GroES-like protein and partial groEL gene for
GroEL-like protein, isolate JM983


601110258F1 NIH_MGC_16 Homo sapiens cDNA clone IMAGE:3350750 5' | 


601110258F1 NIH_MGC_16 Homo sapiens cDNA clone IMAGE:3350750 5' | 


Homo sapiens X28 region near ALD locus containing dual specificity phosphatase 9 (DUSP9), ribosomal
protein L18a (RPL18a), Ca2+/Calmodulin-dependent protein kinase I (CAMKI), creatine transporter (CRTR),
CDM protein (CDM), adrenoleukodystrophy protein >




Drosophila melanogaster sodium channel protein (para) gene, exons 9,10,11,12 and optional segments b, c, d
and e, partial cds


Triticum aestivum stripe rust resistance protein Yr10 (Yr10) gene, complete cds


Salmonella typhimurium adenine-methyltransferase (mod) and restriction endonuclease (res)


UI-H-BI4-aoi-e-07-0-UI.s1 NCI_CGAP_Sub8 Homo sapiens cDNA clone IMAGE:3085140 3'


Botrytis cinerea strain T4 cDNA library under conditions of nitrogen deprivation


Bromus inermis putative cytosolic phosphoglucomutase (pgm1) mRNA, complete cds


Bromus inermis putative cytosolic phosphoglucomutase (pgm1) mRNA, complete cds


PM2-UM0053-240300-035-f12 UM0053 Homo sapiens cDNA


Homo sapiens KIAA0914 gene product (KIAA0914), mRNA


Parvovirus B19 DNA, patient C, genome position 2448-2994


Parvovirus B19 DNA, patient C, genome position 2448-2994


P.falciparum complete gene map of plastid-like DNA (IR-A) I 


AV752605 NPD Homo sapiens cDNA clone NPDBAG06 5' I 


AV752605 NPD Homo sapiens cDNA clone NPDBAG06 5' S 


Sphyrna tiburo NADH dehydrogenase subunit 2 (NADH2) gene, mitochondrial gene encoding mitochondrial
protein, partial cds


601675639F1 NIH_MGC_21 Homo sapiens cDNA clone IMAGE:3958473 5'


601675639F1 NIH_MGC_21 Homo sapiens cDNA clone IMAGE:3958473 5'


RC1-CT0295-241199-011-b02 CT0295 Homo sapiens cDNA


Bartonella clarridgeiae RNA polymerase beta subunit (rpoB) gene, partial cds


Pimpinella brachycarpa zinc finger protein (ZFP1) mRNA, complete cds


601456703F1 NIH_MGC_67 Homo sapiens cDNA clone IMAGE:3869929 5'


Homo sapiens epidermal growth factor receptor (avian erythroblastic leukemia viral (v-erb-b) oncogene
homolog) (EGFR), mRNA


Homo sapiens phytanoyl-CoA hydroxylase (PHYH) gene, exon 5 j 
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Top Hit Descriptor 


Fugu rubripes neural cell adhesion molecule L1 homolog (L1-CAM) gene, complete cds; putative protein 1
(PUT1) gene, partial cds; mitosis-specific chromosome segregation protein SMC1 homolog (SMC1) gene,
complete cds; and calcium channel alpha-1 subunit>


Rabbit MHC fragment RLA-DF DNA j 


Xylella fastidiosa, section 90 of 229 of the complete genome ' 


Chlamydophila pneumoniae AR39, section 21 of 94 of the complete genome j 


PUTATIVE F420-DEPENDENT NADP REDUCTASE j 


| M.aeruginosa (HUB 5-2-4) DNA from plasmid PMA1 | 


Synechocystis sp. PCC6803 complete genome, 13/27, 1576593-1719643 | 


Homo sapiens SOS1 (SOS1) gene, partial cds


Homo sapiens AT-binding transcription factor 1 (ATBF1), mRNA


nn05f11.s1 NCI_CGAP_Pr4.1 Homo sapiens cDNA clone IMAGE:1076877


Pseudomonas aeruginosa topoisomerase (top), putative transcriptional regulatory protein OhbR (ohbR),
ortho-halobenzoate 1,2-dioxygenase beta-ISP protein OhbA (ohbA), OhbC (ohbC), ortho-halobenzoate
1,2-dioxygenase alpha-ISP protein OhbB (ohbB), and put>


Pseudomonas aeruginosa PAO1, section 524 of 529 of the complete genome


QV0-NN1021-100800-337-c03 NN1021 Homo sapiens cDNA


601823684R1 NIH_MGC_79 Homo sapiens cDNA clone IMAGE:4043564 3'


601823684R1 NIH_MGC_79 Homo sapiens cDNA clone IMAGE:4043564 3'


AV661898 GLC Homo sapiens cDNA clone GLCGYG07 3' \ 


Rat IGFII gene for insulin-like growth factor II | 


Zd44e03.r1 Soares_fetal_heart_NbHH19W Homo sapiens cDNA clone IMAGE:343516 5' j 


Homo sapiens cytochrome P450, subfamily XXVIIA (steroid 27-hydroxylase, cerebrotendinous
xanthomatosis), polypeptide 1 (CYP27A1b) mRNA


Arabidopsis thaliana DNA chromosome 4, contig fragment No. 65 j 


Drosophila melanogaster merlin (Dmerlin) mRNA, complete cds j 


Chicken lipoprotein lipase gene j 


Chicken lipoprotein lipase gene | 


Grus canadensis recombination activating protein 1 (RAG-1) gene, partial cds


Grus canadensis recombination activating protein 1 (RAG-1) gene, partial cds


Bacillus halodurans genomic DNA, section 12/14 | 


601067107F1 NIH_MGC_10 Homo sapiens cDNA clone IMAGE:3453505 5' j 


Cyanidium caldarium gene for SigC, complete cds | 


Cyanidium caldarium gene for SigC, complete cds | 
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Top Hit Descriptor 


Homo sapiens human immunodeficiency virus type I enhancer-binding protein 1 (HIVEP1), mRNA


Rattus norvegicus protein tyrosine phosphatase, non-receptor type 5 (Ptpn5), mRNA


Mus musculus mper1 gene for period1, complete cds


Mus musculus NK cell receptor 2B4 gene, promoter region and partial cds j 


Human fibroblast growth factor receptor 3 (FGFR3) gene, intron 7 | 


Human fibroblast growth factor receptor 3 (FGFR3) gene, intron 7 j 


Pyrococcus abyssi complete genome; segment 5/6 1 


Thermus thermophilus cytochrome c-552 (cycA) and CycB (cycB) genes, complete cds j 


Arabidopsis thaliana DNA chromosome 4, contig fragment No. 18 I 


Nicotiana tabacum mRNA for chloroplast ribosomal protein L10, complete cds j 


de biosynthetic gene cluster | 


I DRB pseudogene DRB2 f 


1 DRB pseudogene DRB2 j 


I DRB pseudogene DRB2 j 


§ 

I 


Arabidopsis thaliana DNA chromosome 4, contig fragment No. 40 j 


nn01f12.y5 NCI_CGAP_Co9 Homo sapiens cDNA clone IMAGE:1076495 5' similar to contains THR.t1 THR
repetitive element ;


Drosophila melanogaster Lis1 homolog mRNA, complete cds j 


Mus musculus neuro-d4 gene, exons 3 through 12 and partial cds j 


Methanobacterium thermoautotrophicum from bases 1270510 to 1283409 (section 109 of 148) of the
complete genome


Phytophthora infestans mitochondrion, complete genome j 


Homo sapiens FRA3B common fragile region, diadenosine triphosphate hydrolase (FHIT) gene, exon 5


Rattus norvegicus mRNA for RPHO-1, complete cds


Mus musculus trophinin (Tnn) gene, complete cds


IL3-CT0219-161199-031-C08 CT0219 Homo sapiens cDNA


Rattus norvegicus mRNA for RPHO-1, complete cds


CM4-HT0243-081199-037-e01 HT0243 Homo sapiens cDNA


S.cerevisiae MET, LEU4, and POL1 genes encoding MET4 protein, alpha-isopropylmalate (alpha-IPM)
synthetase (partial), and DNA polymerase alpha (partial)


Homo sapiens thioredoxin-related protein mRNA, complete cds I 


MCKUSICK-KAUFMAN/BARDET-BIEDL SYNDROMES PUTATIVE CHAPERONIN | 


Streptomyces antibioticus polyketi 


Canis familiaris MHC DLA Class I 


Canis familiaris MHC DLA Class 1 


Canis familiaris MHC DLA Class I 


Canis familiaris MHC DLA Class I 
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Top Hit Descriptor 


MCKUSICK-KAUFMAN/BARDET-BIEDL SYNDROMES PUTATIVE CHAPERONIN | 


Molluscum contagiosum virus type 1 ORF1 and ORF2 DNA | 


OVARIAN TUMOR LOCUS PROTEIN [ 


yw14d02.r1 Soares_placenta_8to9weeks_2NbHP8to9W Homo sapiens cDNA clone IMAGE:252195 5'
similar to gb:M36072 60S RIBOSOMAL PROTEIN L7A (HUMAN);


Mus musculus mRNA for NIPSNAP2 protein j 


Mus musculus TANK binding kinase TBK1 (Tbk1) mRNA, complete cds | 


Homo sapiens MHC class 1 region I 


Homo sapiens MHC class 1 region | 


NEURONAL MEMBRANE GLYCOPROTEIN M6-B 


NEURONAL MEMBRANE GLYCOPROTEIN M6-B | 


RC0-TN0080-220800-025-d10 TN0080 Homo sapiens cDNA


RC0-TN0080-220800-025-d10 TN0080 Homo sapiens cDNA


Thermotoga maritima section 23 of 136 of the complete genome


Staphylococcus aureus partial pta gene for phosphate acetyltransferase allele 15


Bos taurus futb and rtlf genes j 


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Saimiri boliviensis olfactory receptor (SBQ27)gene, partial cds | 


Mus musculus gene for oviductal glycoprotein, complete cds j 


Neisseria meningitidis serogroup A strain Z2491 complete genome; segment 7/7 j 


G.gallus mRNA for nicotinic acetylcholine receptor (nAChR) beta 3 subunit j 


RC0-NN1012-270300-021-h06 NN1012 Homo sapiens cDNA


Lymantria dispar nuclear polyhedrosis virus gene for DNA polymerase, complete cds


Ureaplasma urealyticum section 31 of 59 of the complete genome


Homo sapiens mRNA for KIAA1452 protein, partial cds


Haemophilus influenzae Rd section 54 of 163 of the complete genome


Oryctolagus cuniculus mRNA for mitsugumin29, complete cds


Danio rerio Trp4-associated protein Tap1A (tap1A) mRNA, complete cds


Gallus gallus SOX8 transcription factor (SOX8) mRNA, complete cds


601192033F1 NIH_MGC_7 Homo sapiens cDNA clone IMAGE:3535785 5'


Mus musculus embigin (Emb), mRNA


Mus musculus embigin (Emb), mRNA


S.pneumoniae dexB, cap3A, cap3B and cap3C genes and orfs


S.pneumoniae dexB, cap3A, cap3B and cap3C genes and orfs
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Top Hit Descriptor 


| Human insulin receptor (allele 2) gene, exons 14, 15, 16 and 17 | 




Giardia lamblia variant-specific surface protein G3M-B (vspG3M-B) mRNA, partial cds


SMALL HYDROPHOBIC PROTEIN


Homo sapiens KIAA1072 protein (KIAA1072), mRNA


NEURAL-CADHERIN PRECURSOR (N-CADHERIN)


HSC1KH041 normalized infant brain cDNA Homo sapiens cDNA clone c-1kh04


EST371637 MAGE resequences, MAGF Homo sapiens cDNA


Rattus norvegicus transmembrane receptor Unc5H1 mRNA, complete cds


Sphenodon punctatus alpha enolase mRNA, partial cds


D.discoideum racGAP gene


Arabidopsis thaliana 1-amino-1-cyclopropanecarboxylate synthase (ACS5) gene, complete cds


Lycopersicon hirsutum ADP-glucose pyrophosphorylase large subunit (AGP-L1) mRNA, complete cds


Mus musculus major histocompatibility locus class II region: major histocompatibility protein class II alpha
chain (IAalpha) and major histocompatibility protein class II beta chain (IEbeta) genes, complete cds;
butyrophilin-like (NG9), butyrophilin-li>


CITRATE SYNTHASE


Homo sapiens UDP-N-acetyl-alpha-D-galactosamine:polypeptide N-acetylgalactosaminyltransferase 7
(GalNAc-T7) (GALNAC-T7), mRNA


Coturnix coturnix japonica sub-species japonica beta-actin mRNA, partial cds


Coturnix coturnix japonica sub-species japonica beta-actin mRNA, partial cds


RAFFINOSE INVERTASE (INVERTASE)


RAFFINOSE INVERTASE (INVERTASE)


yf24b02.s1 Soares_fetal_liver_spleen_1NFLS Homo sapiens cDNA clone IMAGE:127755 3'


Lycopersicon hirsutum ADP-glucose pyrophosphorylase large subunit (AGP-L1) mRNA, complete cds


Archaeoglobus fulgidus, complete genome


Arabidopsis thaliana 3-methylcrotonyl-CoA carboxylase non-biotinylated subunit (MCCB) mRNA, complete
cds


Arabidopsis thaliana 3-methylcrotonyl-CoA carboxylase non-biotinylated subunit (MCCB) mRNA, complete
cds
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|M29930.1 
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| AF1 84345.1 | 




AF059510.1 
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BLAST E 
Value 
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Top Hit Descriptor 


Mus musculus neuromedin U precursor (Nmu) gene, partial cds; tPhLP (Tphlp) gene, partial cds; CLOCK 
(Clock) gene, complete cds; PFT27 (Pft27) gene, complete cds; and H5AR (H5ar) gene, complete cds 


Mus musculus advillin (Advil-pending), mRNA | 


Mus musculus advillin (Advil-pending), mRNA | 


Mus musculus cytochrome P450, 2b9, phenobarbitol inducible, type a (Cyp2b9), mRNA I 


MUSCARINIC ACETYLCHOLINE RECEPTOR M2


MUSCARINIC ACETYLCHOLINE RECEPTOR M2


H.aspersa mRNA for neurofilament NF70


H.aspersa mRNA for neurofilament NF70


Arabidopsis thaliana DNA chromosome 4, contig fragment No. 88


Homo sapiens mRNA for KIAA0895 protein, partial cds


Homo sapiens chromosome 21 segment HS21C101


Homo sapiens FRA3B common fragile region, diadenosine triphosphate hydrolase (FHIT) gene, exon 5




Homo sapiens hypothetical protein FLJ10793 (FLJ10793), mRNA


Homo sapiens dentin sialophosphoprotein precursor (DSPP) gene, complete cds


Synechocystis sp. PCC6803 complete genome, 9/27, 1056467-1188885


Methanobacterium thermoautotrophicum from bases 317350 to 328792 (section 29 of 148) of the complete
genome


1n14b09.x1 NCI_CGAP_Brn25 Homo sapiens cDNA clone IMAGE:2167577 3' similar to contains Alu
repetitive element;contains element MIR repetitive element ;


Homo sapiens mRNA for KIAA0534 protein, partial cds


Malva pusilla actin (Act1) mRNA, complete cds


Homo sapiens chromosome 21 segment HS21C046


601573026F1 NIH_MGC_9 Homo sapiens cDNA clone IMAGE:3834174 5'


Mus musculus complement component 1 inhibitor (C1nh), mRNA


ta13h01.x1 NCI_CGAP_Lym5 Homo sapiens cDNA clone IMAGE:2043985 3'


Borrelia burgdorferi (section 52 of 70) of the complete genome


Homo sapiens HT017 mRNA, complete cds


Homo sapiens growth arrest-specific 7 (GAS7), transcript variant b, mRNA


Mus musculus antigen (CD72) gene


Mus musculus antigen (CD72) gene
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Top Hit Descriptor 


yz73e07.s1 Soares_multiple_sclerosis_2NbHMSP Homo sapiens cDNA clone IMAGE:288708 3' similar to
contains Alu repetitive element;


yz73e07.s1 Soares_multiple_sclerosis_2NbHMSP Homo sapiens cDNA clone IMAGE:288708 3' similar to
contains Alu repetitive element;


Homo sapiens chromosome 21 segment HS21C101


Escherichia coli K-12 MG1655 section 143 of 400 of the complete genome


AV763842 MDS Homo sapiens cDNA clone MDSCHE04 5'


AV763842 MDS Homo sapiens cDNA clone MDSCHE04 5'


Candida albicans squalene epoxidase (CAERG1) gene, complete cds and translational regulator gene, f


Candida albicans squalene epoxidase (CAERG1) gene, complete cds and translational regulator gene, f




Chlamydia muridarum, section 3 of 85 of the complete genome


Mus musculus mRNA for immunoglobulin gamma heavy chain variable region, isolate PC 281 1


Arabidopsis thaliana DNA chromosome 4, contig fragment No. 69


Arabidopsis thaliana DNA chromosome 4, contig fragment No. 69


Homo sapiens DAN gene, complete cds


Homo sapiens DAN gene, complete cds


FORKHEAD BOX PROTEIN C2 (FORKHEAD-RELATED PROTEIN FKHL14) (MESENCHYME FORK
HEAD PROTEIN 1) (MFH-1 PROTEIN) (TRANSCRIPTION FACTOR FKH-14)


Giardia intestinalis carbamate kinase gene, complete cds


Synechocystis sp. PCC6803 complete genome, 27/27, 3418852-3573470


aj75a05.s1 Soares_parathyroid_tumor_NbHPA Homo sapiens cDNA clone IMAGE:1402256 3' similar to
gb:X56411_rna1 ALCOHOL DEHYDROGENASE CLASS II PI CHAIN (HUMAN);


Rat(hooded) prolactin gene : exon III and flanks


Homo sapiens hevin (HEVIN) mRNA


Homo sapiens mRNA for KIAA1345 protein, partial cds


Stagonospora avenae bgl1 gene for beta-glucosidase, exons 1-4




Mus musculus zinc finger protein (Peg3) mRNA, complete cds 


Mus musculus zinc finger protein (Peg3) mRNA, complete cds 


Anopheles gambiae strain M2 translation initiation factor 4C (1A) (eIF-4C) mRNA, complete cds
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Top Hit Descriptor 


Homo sapiens nuclear factor of kappa light polypeptide gene enhancer in B-cells 1 (NFKB1) gene, complete
cds


Homo sapiens nuclear factor of kappa light polypeptide gene enhancer in B-cells 1 (NFKB1) gene, complete
cds


zx12g12.s1 Soares_total_fetus_Nb2HF8_9w Homo sapiens cDNA clone IMAGE:786310 3' similar to
contains element TAR1 repetitive element ;


Drosophila melanogaster Mst85C gene, complete cds; NMDMC Isoform (Nmdmc) gene, complete cds, 
alternatively spliced; and transcription factor (Relish) gene, complete cds, alternatively spliced 


Mus musculus Wiskott-Aldrich syndrome protein (Wasp), mRNA j 


S.tuberosum mRNA for glucose-6-phosphate dehydrogenase | 


xa95g12.x1 NCI_CGAP_Co17 Homo sapiens cDNA clone IMAGE:2574598 3' [i 


Dendrobium fimbriatum mRNA for phosphoenolpyruvate carboxylase, partial j 


Helicobacter pylori, strain J99 section 47 of 132 of the complete genome | 


Gallid herpesvirus 2, complete genome j 


Gallid herpesvirus 2, complete genome j 


Pseudomonas aeruginosa PAO1, section 167 of 529 of the complete genome


Helicobacter pylori, strain J99 section 47 of 132 of the complete genome


CM3-HT0769-010600-197-C03 HT0769 Homo sapiens cDNA


N-ACETYLGLUCOSAMINYL-PHOSPHATIDYLINOSITOL BIOSYNTHETIC PROTEIN GPI1


Homo sapiens SLIT1 protein (SLIL2) mRNA, partial cds


Homo sapiens lens epithelium-derived growth factor gene, alternatively spliced, complete cds


Homo sapiens sema domain, seven thrombospondin repeats (type 1 and type 1-like), transmembrane domain
(TM) and short cytoplasmic domain, (semaphorin) 5A (SEMA5A) mRNA




Human hereditary haemochromatosis region, histone 2A-like protein gene, hereditary haemochromatosis 
(HLA-H) gene, RoRet gene, and sodium phosphate transporter (NPT3) gene, complete cds 


Mus musculus kinesin light chain 2 (Klc2), mRNA 1 


AV660506 GLC Homo sapiens cDNA clone GLCGID04 3' j 


Homo sapiens chromosome 21 segment HS21 C078 j 


Vibrio cholerae chromosome II, section 39 of 93 of the complete chromosome


H.vulgaris Na,K-ATPase alpha subunit mRNA, complete cds | 


H.vulgaris Na,K-ATPase alpha subunit mRNA, complete cds [ 
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Top Hit Descriptor 


Mus musculus gene for Tob2, complete cds


Homo sapiens interleukin 10 receptor, alpha (IL10RA) mRNA


Homo sapiens SPP2 gene for secreted phosphoprotein 24 precursor, exons 1-8


Arabidopsis thaliana DNA chromosome 4, contig fragment No. 39


Phaseolus vulgaris ATPase gamma subunit mRNA, nuclear gene encoding mitochondrial prote


Acetobacter xylinum putative ATP binding protein delta-AceB gene, partial cds, and GDP-manr
diphosphopolyprenol alpha-mannosyltransferase gene, complete cds


Chicken mRNA for 115-kDa melanosomal matrix protein, complete cds 


j Mus musculus small GTP-binding protein RAB25 (Rab25) gene, complete cds 


yw17f06.r1 Soares_placenta_8to9weeks_2NbHP8to9W Homo sapiens cDNA clone IMAGE:25 


no15c07.s1 NCI_CGAP_Phe1 Homo sapiens cDNA clone IMAGE:1100748 3'


AU138078 PLACE1 Homo sapiens cDNA clone PLACE1007810 5'


Plasmodium berghei cytochrome c oxidase subunit III, cytochrome c oxidase subunit 1, and cytc
genes, mitochondrial genes encoding mitochondrial proteins, complete cds


hv74a10.x1 NCI_CGAP_Lu24 Homo sapiens cDNA clone IMAGE:3179130 3'


S.cerevisiae chromosome IV reading frame ORF YDL097C


Drosophila melanogaster 8kd dynein light chain mRNA, complete cds


Mus musculus dystroglycan 1 (DAG1) gene, exons 1 and 2 and complete cds


Homo sapiens mRNA for KIAA1607 protein, partial cds


M.musculus whn gene


M.musculus whn gene


M.musculus whn gene 


Treponema pallidum section 63 of 87 of the complete genome 


Homo sapiens ataxia telangiectasia (ATM) gene, complete cds 


602150289F1 NIH_MGC_81 Homo sapiens cDNA clone IMAGE:4291126 5'


AV759212 MDS Homo sapiens cDNA clone MDSCGC09 5'


Rat cytomegalovirus Maastricht, complete genome 


HISTIDINE-RICH PROTEIN PRECURSOR (CLONE PFHRP-III)


Haemophilus influenzae Rd section 4 of 163 of the complete genome 


Shigella flexneri multi-antibiotic resistance locus 


Gallus gallus bone morphogenetic protein 1 (BMP1) mRNA, partial cds 


Gallus gallus bone morphogenetic protein 1 (BMP1) mRNA, partial cds 


Lycopersicon esculentum p69a gene, complete CDS
Top Hit Descriptor


D.melanogaster mRNA for metabotropic glutamate receptor


601676889F1 NIH_MGC_21 Homo sapiens cDNA clone IMAGE:3959351 5' | 


glycoprotein IIIa {Alu 1 and 3 fusion junction} [human, Genomic Mutant, 300 nt]


Variola virus, complete genome j 


Variola virus, complete genome | 


Escherichia coli K-12 MG1655 section 203 of 400 of the complete genome


nr09h06.s1 NCI_CGAP_Co10 Homo sapiens cDNA clone IMAGE:1 161371 3' similar to TR:O02916 O0291 S| 
HLARK.; j 


CM-BT043-090299-U4S BT043 Homo sapiens cDNA j 


HYPOTHETICAL 13.7 KD PROTEIN IN INO1-IDS2 INTERGENIC REGION


HYPOTHETICAL 15.3 KD PROTEIN IN VMA12-APN1 INTERGENIC REGION ! 


Mus musculus keratin complex 2, gene 6g (Krt2-6g), mRNA


Homo sapiens 3'-phosphoadenosine 5'-phosphosulfate synthetase (PAPSS) mRNA, complete cds j 


C.limicola pscD gene


HYPOTHETICAL 142.5 KD PROTEIN C23E2.02 IN CHROMOSOME I


Mus musculus calcium-sensing receptor related protein 4 (Casr-rs4) mRNA, partial cds


ys01e08.s1 Soares fetal liver spleen 1 NFLS Homo sapiens cDNA clone IMAGE:213542 3' | 


601336146F1 NIH_MGC_44 Homo sapiens cDNA clone IMAGE:3690010 5'


Human pulmonary surfactant-associated protein SP-B (SFTP3) mRNA, complete cds | 


Arabidopsis thaliana DNA chromosome 4, contig fragment No. 23 | 


NON-STRUCTURAL POLYPROTEIN [CONTAINS: RNA-DIRECTED RNA POLYMERASE ; THIOL j 
PROTEASE P3C ; HELICASE (2C LIKE PROTEIN); COAT PROTEIN] j 


NON-STRUCTURAL POLYPROTEIN [CONTAINS; RNA-DIRECTED RNA POLYMERASE ; THIOL 1 
PROTEASE P3C ; HELICASE (2C LIKE PROTEIN); COAT PROTEIN] | 


Mus musculus secreted acidic cysteine rich glycoprotein (Sparc), mRNA


Rattus norvegicus dihydroxypolyprenylbenzoate methyltransferase mRNA, complete cds j 


Rattus norvegicus dihydroxypolyprenylbenzoate methyltransferase mRNA, complete cds ] 


Rat TRPM-2 gene, complete cds j 


Rat TRPM-2 gene, complete cds I 


Arabidopsis thaliana putative zinc transporter (ZIP1) mRNA, complete cds | 


Homo sapiens mitogen-activated protein kinase kinase kinase kinase 4 (MAP4K4), mRNA \ 


Homo sapiens mitogen-activated protein kinase kinase kinase kinase 4 (MAP4K4), mRNA ) 


Homo sapiens G-protein coupled receptor EDG-7 mRNA, complete cds | 


Homo sapiens G-protein coupled receptor EDG-7 mRNA, complete cds | 


Top Hit 
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NT | 
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z 


NT i 


NT ■ | 


NT | 
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1 

I 
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1 
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SWISSPROT I 
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NT | 
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z 


NT | 
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i 




NT | 


NT i 


z 


NT | 


NT | 


NT | 


NT | 


NT | 


NT | 


Top Hit Accession
No.




|BE902044.1 I 


IS62927.1 | 


9327521 | 


9627521| 


AE000313.1 | 


|aA877715.1 | 


i 

I 


IP47003 | 


P3S073 | 


1 


AF105227.1




Q10135 i 




H72255.1 | 


1 
1 


1 
5 


AL161511.2 i 


|p27410 I 




6678076 | 


L20427.1 i 


L20427.1 ] 


M64733.1 | 


M64733.1 | 


AF033535.1 ; 




11431065| 


AF236117.1 j 


AF236117.1 j 


Most Similar 
(Top) Hit 
BLAST E 
Value 


6.3E-01 


6.3E-01 


3 


6.3E-01 


6.3E-01 


6.3E-01 


6.3E-01 


1 


6.3E-01 


6.3E-01 


6.3E-01 


6.3E-01 


6.3E-01 


6.2E-01 


6.2E-01 


6.2E-01 


6.2E-01 




6.2E-01 


1 


1 


s 
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6.1E-01 


6.1E-01 


6.1E-01 


6.1E-01 
6.1E-01


6.1E-01 
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s 






0.99| 
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27495 1 
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| 28543 1 






1 


1 
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24577 1 
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271 76 | 


27492 | 
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SEQ ID 
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s 
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7421 1 


74211 


8004 1 


| 8413| 


I 


8725 [ 
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9128| 


9220 1 


9435 1 
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6831 1 
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7799 1 


1 8007 1 
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s 


61041 


6792 1 
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71081 
Top Hit Descriptor


Homo sapiens dopamine transporter (SLC6A3) gene, complete cds 


Homo sapiens DNA for amyloid precursor protein, complete cds


Homo sapiens adaptor-related protein complex 3, mu 2 subunit (CLA20), mRNA 


Human respiratory syncytial virus strain CH93-53b attachment protein (G) gene, complete cds 


Viral hemorrhagic septicemia virus N, P, M, G, Nv, L genes, French strain 07-71 


D(2) DOPAMINE RECEPTOR


|U;-H-BI1-aeb-a-10-0-Ul.s1 NCI_CGAP_Sub3 Homo sapiens cDNA clone IMAGE:271S619 3' 


| Musca domestica insecticide-susceptible strain voltage-sensitive sodium channel mRNA, comr. 


Homo sapiens partial LMO1 gene for LIM domain only 1 protein, exon 1


SEGMENTATION PROTEIN FUSHI TARAZU 


SEGMENTATION PROTEIN FUSHI TARAZU 


Homo sapiens genes for leukotriene B4 receptor BLT2, leukotriene B4 receptor BLT1 , complefe 


PEROXISOMAL MEMBRANE PROTEIN PER9 (PEROXIN-3)


Gallus gallus mRNA for Hyperion protein, 419 kD isoform 


Gallus gallus mRNA for Hyperion protein, 419 kD isoform




i Homo sapiens nuclear factor (erythroid-derived 2)-like 3 (NFE2L3), mRNA 


|Mus musculus cGMP-inhibited phosphodiesterase (Pde3a), mRNA 


jRC1-HT0375-030500-015K;03 HT0375 Homo sapiens cDNA 


j Haemophilus influenzae Rd section 16 of 163 of the complete genome 


j Homo sapiens chromosome 21 segment HS21C067 


j Homo sapiens chromosome 21 segment HS21 C067 


Rattus norvegicus cenexin 2 mRNA, partial cds


| Homo sapiens low density lipoprotein receptor-related protein II (LRP2) gene, exon 1 and partia 


] THYMIDYLATE KINASE (DTMP KINASE) 


jPM1-DT0041-190100-002-h03 DT0041 Homo sapiens cDNA 


Mus spretus strain SPRET/Ei CD48 antigen (Cd48) gene, partial cds


|Oryctolagus cuniculus alpha 1 anti-trypsin (alpha 1 AT) gene, promoter region 


J Aspergillus oryzae pyrG gene for orotidine-5'-phosphate decarboxylase, complete cds 


MICROTUBULE-ASSOCIATED PROTEIN 1A [CONTAINS: MAP1 LIGHT CHAIN LC2]


jSIM1 PROTEIN 
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]AF064626.1 


|L42320.1 






|P40472 
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(Top) Hit 
BLAST E 
Value 


6.1E-01| 


6.0E-01 1 
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6.0E-01 1 


6.0E-01 1 


6.0E-01 1 


6.0E-01 [ 


s 


6.0E-01 1 
6.0E-01


6.0E-01 1 
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. 27727 | 




28541 1 


28542 | 


28939 I 
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22954 | 






1 2591 7I 
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| 28454 | 


§ 
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I 25342! 






| 21 637 1 


Exon 
SEQ ID 
NO: 


1 176281 


I 104291 


1 


| 11245| 


1 
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1 


I 162131 






! 17503 | 
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§ 








109071 


i 13156| 


1 131561 


a 
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| 18205| 


| 18401 1 








| 11763[ 
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SEQ ID 
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1 7778 
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1 3744 


1 5227 
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1 6725 1 
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I 8838 I 
Top Hit Descriptor


601852474F1 NIH_MGC_56 Homo sapiens cDNA clone IMAGE:4076131 5'


| Vigna radiata mRNA for proton pyrophosphatase, complete cds [ 


jMegaselia scalaris sex-lethal homolog (Megsxi) gene, partial cds, alternatively spliced products j 


|HUM500E06B Human placenta polyA+ (TFujiwara) Homo sapiens cDNA clone GEN-500E06 5' 1 


[cyclic AMP-regulated phosphoprotein [rats, mRNA, 1030 nt] i 


yn91b03.s1 Soares adult brain N2b5HB55Y Homo sapiens cDNA clone IMAGE:175757 3' similar to
gb:S78187 M-PHASE INDUCER PHOSPHATASE 2 (HUMAN);


SPORE COAT PROTEIN SP96


|SPORE COAT PROTEIN SP96 j 


Homo sapiens partial TCF-4 gene for T-cell transcription factor-4, exons 6-11


|Homo sapiens partial 5-HT4 receptor gene, exons 2 to 5 | 


|602127577F1 NIH_MGC_56 Homo sapiens cDNA clone IMAGE:4284403 5' | 


602127577F1 NIH_MGC_56 Homo sapiens cDNA clone IMAGE:4284403 5'


| PUTATIVE TRANSCRIPTION FACTOR OVO-LIKE 1 (MOVOI) (MOVOIA) | 


Populus euramericana peacs-2 mRNA for 1-aminocyclopropane-1-carboxylate synthase, complete cds


|Homo sapiens T cell receptor beta chain (BV6S7*2-BJ1S1) mRNA, partial cds ! 


Rattus norvegicus cAMP-regulated guanine nucleotide exchange factor II (cAMP-GEFII) mRNA, partial cds


|601454952F1 NIH_MGC_66 Homo sapiens cDNA clone IMAGE:3858590 5' | 


jBotrytis cinerea strain T4 cDNA library under conditions of nitrogen deprivation j 


jPYRROLINE-5-CARBOXYLATE REDUCTASE (P5CR) (P5C REDUCTASE) | 


MR3-HT0736-180700-003-a02 HT0736 Homo sapiens cDNA


| Homo sapiens mRNA for KIAA0740 protein, partial cds [ 


|Homo sapiens mRNA for KIAA0740 protein, partial cds I 


| Arabidopsis thaliana DNA chromosome 4, contig fragment No. 13 j 


? 
1 
| 

I 


j AV684703 GKC Homo sapiens cDNA clone GKCFSF05 5' | 


j AV684703 GKC Homo sapiens cDNA clone GKCFSF05 5' I 


|601514007F1 NIH_MGC_71 Homo sapiens cDNA clone IMAGE:3915457 5' . | 


ng75g10.s1 NCI_CGAP_Pr6 Homo sapiens cDNA clone IMAGE:940674 similar to contains element PTR7
repetitive element;


jHIGH AFFINITY POTASSIUM TRANSPORTER | 


602132029F1 NIH_MGC_81 Homo sapiens cDNA clone IMAGE:4271334 5'
Top Hit Descriptor


Rattus norvegicus Propionyl Coenzyme A carboxylase, beta polypeptide (Pccb), mRNA | 


GAG POLYPROTEIN [CONTAINS: INNER COAT PROTEIN P12; CORE PROTEIN P15; CORE SHELL
PROTEIN P30; NUCLEOPROTEIN P10]


GAG POLYPROTEIN [CONTAINS: INNER COAT PROTEIN P12; CORE PROTEIN P15; CORE SHELL I 
PROTEIN P30; NUCLEOPROTEIN P10] | 


Homo sapiens superkiller viralicidic activity 2 (S. cerevisiae homolog)-like (SKIV2L), mRNA


yo18a10.s1 Soares adult brain N2b5HB55Y Homo sapiens cDNA clone IMAGE:178266 3'


Rabbit oral papillomavirus, complete genome j 


FOS-RELATED ANTIGEN-1 i 


Homo sapiens KIAA0929 protein Msx2 interacting nuclear target (MINT) homolog (KIAA0929), mRNA | 


Homo sapiens KIAA0929 protein Msx2 interacting nuclear target (MINT) homolog (KIAA0929), mRNA | 


Pseudomonas syringae pv. tomato strain DC3000 AvrE (avrE), HrpW (hrpW), and GstA (gstA) genes, j 
complete cds; and unknown genes | 


Pseudomonas syringae pv. tomato strain DC3000 AvrE (avrE), HrpW (hrpW), and GstA (gstA) genes, j 
complete cds; and unknown genes


QV4-NN0040-070400-1 S0-c04 NN0040 Homo sapiens cDNA ) 


Chlamydophila pneumoniae AR39, section 74 of 94 of the complete genome j 


Drosophila melanogaster mRNA for 15,15' beta carotene dioxygenase (beta-diox gene)


602075545F1 NIH_MGC_62 Homo sapiens cDNA clone IMAGE:4243690 5' i 


NITRATE REDUCTASE [NADPH] (NR) | 


LAMININ ALPHA-2 CHAIN PRECURSOR (LAMININ M CHAIN) (MEROSIN HEAVY CHAIN) | 


LAMININ ALPHA-2 CHAIN PRECURSOR (LAMININ M CHAIN) (MEROSIN HEAVY CHAIN) | 


wl37g04.x1 NCI_CGAP_Ut1 Homo sapiens cDNA clone IMAGE:242?126 3' similar to gb:M13452 LAMIN A I 
I (HUMAN); j 


Homo sapiens HLA class III region containing tenascin X (tenascin-X) gene, partial cds; cytochrome P450 21-
hydroxylase (CYP21B), complement component C4 (C4B) G11, helicase (SKI2W), RD, complement factor B
(Bf), and complement component C2 (C2) genes,>


Brassica oleracea var. capitata phospholipase D2 (PLD2) gene, complete cds j 


Brassica oleracea var. capitata phospholipase D2 (PLD2) gene, complete cds j 


Homo sapiens protein tyrosine phosphatase, receptor-type, zeta polypeptide 1 (PTPRZ1) mRNA


Homo sapiens protein tyrosine phosphatase, receptor-type, zeta polypeptide 1 (PTPRZ1) mRNA | 


Homo sapiens secreted C-type lectin precursor (LSLCL) gene, complete cds i 
Top Hit Descriptor


| Mycoplasma genitalium section 9 of 51 of the complete genome | 


zu42h12.y5 Soares ovary tumor NbHOT Homo sapiens cDNA clone IMAGE:740711 5'


zu42h12.y5 Soares ovary tumor NbHOT Homo sapiens cDNA clone IMAGE:740711 5'


7e73c12.x1 NCI_CGAP_Pr28 Homo sapiens cDNA clone IMAGE:3288118 3' similar to gb:J02783
PROTEIN DISULFIDE ISOMERASE PRECURSOR (HUMAN);


7e73c12.x1 NCI_CGAP_Pr28 Homo sapiens cDNA clone IMAGE:3288118 3' similar to gb:J02783
PROTEIN DISULFIDE ISOMERASE PRECURSOR (HUMAN);


Roridula gorgonias ribulose 1,5-bisphosphate carboxylase (rbcL) gene, partial cds; chloroplast gene for
chloroplast product


I6013398S7F1 NIH MGC 53 Homo sapiens cDNA clone IMAGE:36821 68 5' j 


Og30e05.s1 NCI_CGAP_Br7 Homo sapiens cDNA clone IMAGE:1441376 3' similar to gb:J02611
APOLIPOPROTEIN D PRECURSOR (HUMAN);


j Drosophila melanogaster helix-loop-helix mRNA, complete cds } 


NUCLEAR FACTOR OF ACTIVATED T CELLS 5 (T CELL TRANSCRIPTION FACTOR NFAT5) (NF-AT5)
(REL DOMAIN-CONTAINING TRANSCRIPTION FACTOR NFAT5)


jHomo sapiens phospholipid scramblase 1 gene, complete cds \ 


jHomo sapiens chromosome 21 segment HS21C085 I 


j Homo sapiens mRNA for KIAA0740 protein, partial cds j 


Chlamydophila abortus strain S26/3 POMP91 A and POMP90A precursor, genes, complete cds j 


[Azotobacter vinelandii icd gene for isocitrate dehydrogenase, complete cds j 


| Botrytis cinerea strain T4 cDNA library under conditions of nitrogen deprivation \ 


!am77g05.s1 Stratagene schizo brain S1 1 Homo sapiens cDNA done IMAGE:16165D4 3' j 


Medicago sativa chloroplast malate dehydrogenase precursor (plmdh) mRNA, nuclear gene encoding
chloroplast protein, complete cds


Homo sapiens chromosome Xq28 melanoma antigen family A2a (MAGEA2A), melanoma antigen family A12 
(MAGEA12), melanoma antigen family A2b (MAGEA2B), melanoma antigen family A3 (MAGEA3), caltractin 
(CALT), NAD(P)H dehydrogenase-like protein (NSDHL), and Ll> 


jMus musculus vanilloid receptor-like protein 1 (Vrh), mRNA j 


[Homo sapiens PELOTA (PELOTA) gene, complete cds j 


UI-H-BI1-acp-a-08-0-UI.s1 NCI_CGAP_Sub3 Homo sapiens cDNA clone IMAGE:2714799 3'


retinoic acid receptor gamma (rar-gamma) (retinoic acid receptor delta) (rar-delta)


Human adrenodoxin reductase gene, exons 3 to 12 j 


Polyangium vitellinum (strain PI vt1) 16S rRNA gene
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| 20347| 


1 


SEQ ID 
NO: 


1 






1 






1 












I 


1 










I 






1 


8 


§ 




Probe 
SEQ ID 
NO: 








I 


1 




1 


1 








1 


1 


1 


1 


I 


3 


I 


1 


i 


1 




1 


1 
Top Hit Descriptor


Polyangium vitellinum (strain PI vt1) 16S rRNA gene


R.norvegicus mRNA for mammalian fusca protein ) 


602139319F1 NIH_MGC_46 Homo sapiens cDNA clone IMAGE:4298117 5'




TRANSCRIPTION-REPAIR COUPLING FACTOR (TRCF) | 


yi94a09.s1 Soares placenta Nb2HP Homo sapiens cDNA clone IMAGE:146872 3'


Human regenerating protein (reg) gene, complete cds j 


65B1 Human retina cDNA Tsp509l-cleaved sublibrary Homo sapiens cDNA not directional | 




nac51f10.x1 NCI_CGAP_Brn23 Homo sapiens cDNA clone IMAGE:3406218 3' similar to contains element
TAR1 repetitive element;


Homo sapiens postmeiotic segregation increased 2-like 9 (PMS2L9), mRNA


Homo sapiens postmeiotic segregation increased 2-like 9 (PMS2L9), mRNA j 


Buchnera aphidicola genomic fragment containing (chaperone Hsp60) groEL, DNA biosynthesis initiating
protein (dnaA), ATP operon (atpCDGAHFEB), and putative chromosome replication protein (gidA) genes, 
complete cds; and termination factor Rho (rho) gene> 


Buchnera aphidicola genomic fragment containing (chaperone Hsp60) groEL, DNA biosynthesis initiating
protein (dnaA), ATP operon (atpCDGAHFEB), and putative chromosome replication protein (gidA) genes, 
complete cds; and termination factor Rho (rho) gene> 


Mus musculus anti-DNA immunoglobulin light chain IgM mRNA, antibody 363p.138, partial cds


Homo sapiens mRNA for KIAA1184 protein, partial cds


Xenopus laevis smooth muscle beta-tropomyosin mRNA, complete cds


601903871F1 NIH_MGC_19 Homo sapiens cDNA clone IMAGE:4136632 5'


GLYCOGEN DEBRANCHING ENZYME (GLYCOGEN DEBRANCHER) [INCLUDES: 4-ALPHA-
GLUCANOTRANSFERASE (OLIGO-1,4-1,4-GLUCANTRANSFERASE); AMYLO-1,6-GLUCOSIDASE
(DEXTRIN 6-ALPHA-D-GLUCOSIDASE)]


GLYCOGEN DEBRANCHING ENZYME (GLYCOGEN DEBRANCHER) [INCLUDES: 4-ALPHA-
GLUCANOTRANSFERASE (OLIGO-1,4-1,4-GLUCANTRANSFERASE); AMYLO-1,6-GLUCOSIDASE
(DEXTRIN 6-ALPHA-D-GLUCOSIDASE)]


601445024F1 NIH_MGC_65 Homo sapiens cDNA clone IMAGE:3849436 5' j 


Oryzias latipes gene for membrane guanylyl cyclase OIGC1, complete cds 


Mus musculus MRC OX-2 antigen homolog gene, exons 2-5, and complete cds j 


Homo sapiens chromosome 21 segment HS21C102 i 


NUCLEAR ENVELOPE PROTEIN CUT11


Top Hit 
Database 


! IN | 


!nt 1 


lEST HUMAN | 


jEST HUMAN | 


j SWISSPROT | 


Jest human | 


1 in: 


z 

=> 


]est human 1 


I 

I 




INT | 


NT 


NT 


z 


!NT I 


INT | 


jEST HUMAN | 


SWISSPROT 


SWISSPROT 


iEST HUMAN | 


Int I 




INT I 


iSWISSPROT | 


Top Hit Accession
No.


|AJ233944.1 | 


1 
5? 


;. 


1 
< 


§ 

s 


SR80873.1 - | 


J05412.1 j 




BF030207.1 j 


!bF439932.1 


I 


1 


AF008210.1 


AF008210.1 


! U55574.1 j 


AB033010.1 | 


IM92304.1 i 


IBF317212.1 i 


iP35573 


P35573 




AB021490.2


AF029215.1 i 


1AL163302.2 \ 


.013961 i 


Most Similar 
(Top) Hit 
BLAST E 
Value 


1E-01 


1E-01 


I E-01 


IE-01 


IE-01 


IE-01 


IE-01 


s 


IE-01 


s 


; 5.0E-01 1 


1 


5.0E-01 


5.0E-01 


! 5.0E-01| 


5.0E-01 1 


5.0E-01 [ 


! 5.0E-01 1 


i 


5.0E-01 


I 5.0E-01 1 


S 5.0E-01 1 


| 5.0E-01 1 


\ 5.0E-01 1 


! 5.0E-01 1 




















& 

I 1 


| 3.2B 


I 1.06, 




I 3.84| 


1 




! 4.54| 




S 3.62! 


2.03; 




1 L37| 




1.32 


I 0.84| 




5.62| 


| 3.61 1 




2.25 






| 2.26! 


! 2.02 




ORF SEQ
ID NO:


| 20382 






| 23667 






I 27650 | 


| 27653 [ 






! 21871 | 


| 21872| 


21882 


i 


| 23388 1 


23495 | 




26644 | 




27615 












SEQ ID 
NO: 


1 


! 11536| 






t 




17435 1 












i 


1 


13601 | 


13709 | 




16455 1 




17401 


I 17791 1 






| 19349; 


1 


Probe 
SEQ ID 
NO: 


s 


1 


| 1977| 


1 


i 


I 61331 


I 7584 I 


1 


I 9230 1 


| 9473 1 


I 2087 | 


| 2087 | 


2097 


2097 


1 


| 3797 | 


I 


I 7442I 


7550 


7550 


I 


I 9003 | 


I 91701 


S 

8 


I 
Top Hit Descriptor


ZINC FINGER X-CHROMOSOMAL PROTEIN


qo39f09.x1 NCI_CGAP_Lu5 Homo sapiens cDNA clone IMAGE:1910921 3'


j GLYCOPROTEIN B PRECURSOR (GLYCOPROTEIN 14) j 


TYROSINE-PROTEIN KINASE RECEPTOR TIE-1 PRECURSOR


I beta -HKA=H,K-ATPase beta-subunit [rats, Genomic, 8983 nt, segment 2 of 2] j 


| beta -HKA=H,K-ATPase beta-subunit [rats, Genomic, 8983 nt, segment 2 of 2] | 


Mus musculus sodium channel, type X, alpha polypeptide (Scn10a), mRNA


Homo sapiens chromosome 21 segment HS21C082


Autographa californica nucleopolyhedrovirus, complete genome


|UV EXCISION REPAIR PROTEIN PROTEIN RAD23 HOMOLOG A (HHR23A) j 


| Callithrix jacchus MW/LW opsin gene, upstream flanking region I 


Callithrix jacchus MW/LW opsin gene, upstream flanking region | 


QV4-SN0024-200400-183-b01 SN0024 Homo sapiens cDNA


CM2-DT0003-010200-077-C01 DT0003 Homo sapiens cDNA


I MRO-BN0070-270300-008-g04 BN0070 Homo sapiens cDNA | 


Aquifex aeolicus section 30 of 109 of the complete genome 


iHuman somatostatin 1 gene and flanks j 


Callithrix jacchus MW/LW opsin gene, upstream flanking region


Callithrix jacchus MW/LW opsin gene, upstream flanking region


Arabidopsis thaliana DNA chromosome 4, contig fragment No. 14 | 


Xestia c-nigrum granulovirus, complete genome | 


Saimiri sciureus olfactory receptor (SSC186) gene, partial cds


1 Coturnix coturnix japonica ifnG gene | 


DNA GYRASE SUBUNIT B


602023134F1 NCI_CGAP_Brn67 Homo sapiens cDNA clone IMAGE:4158296 5'


Methanococcus voltae flagella-related protein C-I (flaC-flaI) genes, complete cds


hh74e10.y1 NCI_CGAP_GU1 Homo sapiens cDNA clone IMAGE:2968554 5'


hh74e10.y1 NCI_CGAP_GU1 Homo sapiens cDNA clone IMAGE:2968554 5'


xn63e05.x1 Soares_NHCeC_cervical_tumor Homo sapiens cDNA clone IMAGE:2698400 3' similar to 
:-R:O00189 000189 ML-ADAPTIN-RELATED PROTEIN 2. ; 


I Equus caballus microsatellite LEX027 | 


RC3-BN0034-290200-01 3-d 2 BN0034 Homo sapiens cDNA | 


RC3-BN0034-290200-013-C12 BN0034 Homo sapiens cDNA


Streptomyces coelicolor whiH gene




ll 


ISWISSPROT | 


| EST HUMAN | 


| SWISSPROT i 


ISWISSPROT | 


z 


z 


z 




z 


ISWISSPROT 1 


z 




| 


I EST HUMAN | 


|EST HUMAN | 






z 


z 






z 


z 


ISWISSPROT | 


lESTHUMAN | 


z 


| EST HUMAN | 


jEST HUMAN | 


|est human I 




| EST HUMAN | 


| EST HUMAN | 


z 




Z 


O 


< 


I 


|P35590 


|S76404.1 


|S76404.1 




AL163282.2


1 


|P54725 


|AF155218.1 


1 




1 
1 


|AW999477.1 


|AE000598.1 J 


|J00306.1 


I 
1 


|AF155218.1 


|AL161502.2 [ 


| 9635250 1 


AF179825.1


|AJ001678.1 | 


|033367 ! 


|BF348001.1 i 


|U97040.1 


8 

S 

s 

< 


|AW630048.1 | 


AW170559.1


|AF075629.1 ! 


1 


|AW993658.1 | 


AJ003022.1


Most Similar 
(Top) Hit


BLAST E 
Value 


i 


§ 


i 


i 


1 


i 


4.4E-01 


i 


9 


i 




3 


s 






s 


s 


s 


i 


4.3E-01 


4.3E-01 


s 


i 


s 


3 


s 


1 




S 


i 


4.3E-01 


4.3E-01 


4.3E-01 


1 


I 

m 














9 




1 






s 




















I 


1 
















s 


s 




l ORFSEQ 


6 
z 

Q 


j 27422 j 


| 27798 | 




| 27903 | 


| 28025 [ 


1 


j 25329| 










201 79 1 


21 350 1 








23737 | 


20178| 








25592 | 


3 
1 












27967 | 


8 
8 


28673 1 


1 




c 


□ 


17222! 


1 7573 




1 




17785 








§ 












































1 


1 




































































1 Probe 


p 


| 7354 


| 7723 


| 7724 


| 7813 


I 


| 7935 


1 


1 


1 


| 9824 1 






| 1585| 




1 


1 


§ 

9 


| 4306| 


| 4306| 


| 4875| 






a 

8 


| 6102| 


1 


§ 


1 


1 


| 7872 


| 8302| 


| 8537 I 


| 8537 


1 
Top Hit Descriptor


CELL DIVISION PROTEIN FTSH HOMOLOG PRECURSOR j 




Xylella fastidiosa, section 93 of 229 of the complete genome | 




QV0-LT0015-180200-127-h01 LT0015 Homo sapiens cDNA


O 
EC 


RC5-BT0559-020300-013-E06 BT0559 Homo sapiens cDNA


nj59h01.s1 NCI_CGAP_Pr10 Homo sapiens cDNA clone IMAGE:997777 similar to gb:M33600 HLA CLASS
II HISTOCOMPATIBILITY ANTIGEN, DR-1 BETA CHAIN (HUMAN);


yf77e01.r1 Soares infant brain 1NIB Homo sapiens cDNA clone IMAGE:28278 5'


601879721F1 NIH_MGC_55 Homo sapiens cDNA clone IMAGE:4108493 5' j 


RC3-CT0254-060400-029-g04 CT0254 Homo sapiens cDNA


AU158472 PLACE2 Homo sapiens cDNA clone PLACE2000470 3'


AU158472 PLACE2 Homo sapiens cDNA clone PLACE2000470 3'


Brca1=breast cancer gene [rats, WF, spleen, Genomic, 419 nt, segment 2 of 2]


Arabidopsis thaliana DNA chromosome 4, contig fragment No. 47


EST369413 MAGE resequences, MAGE Homo sapiens cDNA


EST369413 MAGE resequences, MAGE Homo sapiens cDNA


Homo sapiens cytochrome c oxidase subunit VIc (COX6C), nuclear gene encoding mitochondrial protein,
mRNA


MR3-SN0010-280300-103-h07 SN0010 Homo sapiens cDNA | 


Oryzias latipes OIGC7 mRNA for membrane guanylyl cyclase, complete cds j 


601660352R1 NIH_MGC_71 Homo sapiens cDNA clone IMAGE:3906085 3' ] 


AV731815 HTF Homo sapiens cDNA clone HTFBHH05 5'


RC-BT091-210199-142 BT091 Homo sapiens cDNA


AV705243 ADB Homo sapiens cDNA clone ADBAHF08 5' j 


AV705243 ADB Homo sapiens cDNA clone ADBAHF08 5' | 


PM-BT103-270499-684 BT103 Homo sapiens cDNA


Homo sapiens anaphase-promoting complex subunit 7 (APC7), mRNA ( 


Arabidopsis thaliana DNA chromosome 4, contig fragment No. 36 j 


Arabidopsis thaliana DNA chromosome 4, contig fragment No. 36


Oj94b08.s1 Soares_NFL_T_GBC_S1 Homo sapiens cDNA clone IMAGE:1505943 3' ) 


Rhodococcus sp. AD45 isoG, isoH, isoI, isoJ, isoA, isoB, isoC, isoD, isoE and isoF genes


om33d02.s1 Soares_NFL_T_GBC_S1 Homo sapiens cDNA clone IMAGE:1542819 3'


Top Hit 
Database 


SWISSPROT | 


i 


NT I 


EST HUMAN j 


EST HUMAN [ 


SWISSPROT [ 


EST HUMAN | 


|esT HUMAN | 


EST HUMAN | 




EST HUMAN | 


EST HUMAN | 


EST HUMAN | 


I 


NT I 


EST HUMAN [ 


EST HUMAN | 


|nT ! 


EST_HUMAN | 


NT J 


EST HUMAN | 


EST HUMAN | 


ESTJHUMAN | 


EST HUMAN | 


i 
ffi 


EST HUMAN | 


NT | 


NT | 


z 


EST_HUMAN | 


z 


EST HUMAN | 


Top Hit Accession
No.


Q39102 


AA761653.1 


AE003947.1 


AI280338.1 | 


AW835527.1 




BE073574.1 I 


1 

1 


R13467.1 ! 


I 


AW854162.1 | 


AU158472.1 | 


AU158472.1


S82504.1 | 


AL161547.2


AW957448.1 | 


AW957448.1 | 


| 4758039| 


8 
1 
1 


1 
§ 


1 

: : 




AI905481.1 


AV705243.1 i 


AV705243.1 




7705283 | 


AL161536.2


AL161536.2


i 
1 


AJ249207.1 


i 


Most Similar 
(Top) Hit 
BLAST E 
Value 


4.2E-01 


4.2E-01 


4.2E-01 


4.2E-01 


4.2E-01 


4.2E-01 


4.2E-01 


i 


4.2E-01 


4.2E-01 


§ 


4.2E-01 


4.2E-01 


4.2E-01 


4.2E-01 


4.2E-01 


4.2E-01 


1 4.2E-01 


4.2E-01 


4.2E-01 


4.2E-01 


4.2E-01 


E-01 


E-01 


<? 


2 




E-01 


s 


E-01 


E-01 


E-01 






















S 

o "to 

8 & 

&* 


1.04, 


0.89| 


4.78! 


1.09| 


0.96f 
































2.04I 


! 


s 






8 




1.76! 




1.13| 


2.59j 




ORFSEQ 
ID NO: 


21099| 




$ 




23503 | 


23595| 








25484| 




s 

8 


§ 


2621 9 1 




26739| 


? 

8 


| 
1 


8 


28528I 


I 




1 


3 


i 


1 


22434| 


22633 | 


1 




23852| 




SEQ ID 
NO: 


1 


1800 




1 


3715| 




! 


1 




5422! 


s 
s 


8 


6024 1 


1 


1 


6543 j 


1 


1 




8276 


8517 






§ 




1496! 


25431 






3186' 


4077 




































































Probe 
SEQ ID 
NO: 


1 1335| 


1804| 




35851 






1 




46711 




1 


6151 1 


61511 


6184| 


62171 


1 


I 




1 




1 


1 


10781 


1087] 


g 


15921 


2678| 


2910 1 




3263| 


41771 


4208| 
Top Hit Descriptor


yg11b03.s1 Soares infant brain 1NIB Homo sapiens cDNA clone IMAGE:31814 3'


AV747880 NPC Homo sapiens cDNA clone NPCBDF10 5' j 


Bacillus subtilis complete genome (section 21 of 21): from 3999281 to 4214814


602155590F1 NIH_MGC_83 Homo sapiens cDNA clone IMAGE:4297319 5'


Methanococcus jannaschii section 77 of 150 of the complete genome


602133261F1 NIH_MGC_81 Homo sapiens cDNA clone IMAGE:4288238 5'


Mus musculus signaling intermediate in Toll pathway-evolutionarily conserved (Sitpec-pending), mRNA


Campylobacter jejuni NCTC11168 complete genome; segment 3/6


CM2-HT0137-200999-010-e08 HT0137 Homo sapiens cDNA


Zea mays ZMPMS2 gene for 19 kDa zein protein


VOLTAGE-GATED POTASSIUM CHANNEL PROTEIN KV1.1 (HUKI) (HBK1) ] 


Homo sapiens DNA for amyloid precursor protein, complete cds ] 


Laqueus rubellus mitochondrion, complete genome | 


Drosophila melanogaster Dalmatian (dmt) mRNA, complete cds j 


Mus musculus platelet derived growth factor receptor, beta polypeptide (Pdgfrb), mRNA ] 


Ascobolus immersus masc2 gene j 


Ascobolus immersus masc2 gene | 


Mus musculus ubiquitin-protein ligase e3 component n-recognin (Ubr1), mRNA


Homo sapiens chromosome 21 segment HS21C080 | 


Homo sapiens chromosome 21 segment HS21C080


phosphate-UDP-MurNAc-pentapeptide phospho-MurNAc-pentapeptide transferase (mraY) genes, complete 


Ovis aries partial JD2 gene for T cell receptor delta chain (TCRDJ2), exon 1 j 


Ovis aries partial JD2 gene for T cell receptor delta chain (TCRDJ2), exon 1 j 


NADH-PLASTOQUINONE OXIDOREDUCTASE CHAIN 5, CHLOROPLAST | 


EST382691 MAGE resequences, MAGE Homo sapiens cDNA


Synechocystis sp. PCC 9413 transposase gene, complete cds


Homo sapiens chromosome 21 segment HS21C100


HYPOTHETICAL 49.7 KD PROTEIN IN GIN2-STE3 INTERGENIC REGION | 


Campylobacter jejuni NCTC11168 complete genome; segment 2/6


CV4-HT0136-150999-014-f09 HT0136 Homo sapiens cDNA


Gorilla gorilla carboxyl-ester lipase (CEL) gene, complete cds | 


Top Hit 


EST HUMAN I 


i 

a 


z 


EST HUMAN [ 


NT I 


EST HUMAN 


NT | 


z 


EST HUMAN j 


NT ! 


SWISSPROT i 


NT 


NT | 


NT j 


NT j 


NT 


NT i 


NT j 


NT | 


NT 


NT 


z 


NT 


SWISSPROT 


EST HUMAN 


NT 


z 


SWISSPROT ; 


z 


EST HUMAN 


NT 


Top Hit Accession
No.


i 

rr 


IAV747880.1 \ 


Z99124.1 i 


1 


U67535.1 | 


|BF574604.1 | 


1 




IBF349382.1 \ 


1X58700.1 i 


| 


|D87675.1 I 


s 
s 

1 




s 
1 


1 


1 


5 
S 
S§ 


AL163280.2


AL163280.2


< 


|AJ277511.1 I 


1AJ277511.1 | 


|Q31849 




§ 

s 


AL163300.2




(AL139075.2 I 


|AW352188.1 


|AF206618.1 


Most Similar 
(Top) Hit 
BLAST E 
Value 


E-01 


E-01 


E-01 


s 


E-01 


E-01 




3 


E-01 


E-01 


E-01 


E-01 


4.0E-01 


9 


4.0E-01 


4.0E-01 


4.0E-01 


4.0E-01 


4.0E-01 


3 


3 


4.0E-01 


4.0E-01 




4.0E-01 


4.0E-01 


4.0E-01 


3 


4.0E-01 


3.9E-01 


1 


























Expression 


o 




0.87| 




2.911 




I.23I 






s 




1.721 


1.27| 


d 


4.0BJ 


1.17| 






2.24| 










7.34I 


1.311 


3.24| 


2.22| 






1.59| 




ORF SEQ 
ID NO: 




24247| 




8 




26757I 


27323 I 






1 


28121 | 




. 20783 | 


210791 




21 742 | 




1 


22665 I 


I 




1 


23452I 




1 










20007| 


2-1119j 


SEQ ID 
NO: 




i 






6257 










1 




9702 


1 




1374 


2709 


§ 






1 


s 


1 




1 


551 7 1 
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9327 1 
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1 
































































Probe 
SEQ ID 
NO: 


43731 


1 


5185J 


! 


5 
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1 


1 


I 


1 


82131 
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i 




1 
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1 


27741 


2939! 
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37551 


3755I 


47121 


I 




9312: 




1 




1356| 
Top Hit Descriptor


Homo sapiens mRNA for KIAA1193 protein, partial cds


H. sapiens B-myb gene


H. sapiens B-myb gene


% 
| 

! 
1 

i 

E 

! 


7iS1d01.x1 NCI_CGAP_Br16 Homo sapiens cDNA clone IMAGE:3339169 3'


601563948F1 NIH_MGC_20 Homo sapiens cDNA clone IMAGE:3833699 5'


6018623S2F1 NIH_MGC_53 Homo sapiens cDNA clone IMAGE:4082055 5'


xn86d04.x1 Soares_NFL_T_GBC_S1 Homo sapiens cDNA clone IMAGE:2701351 3' similar to TR:O94821 O94821 KIAA0713 PROTEIN ;


wp76a02.x1 NCI_CGAP_Brn25 Homo sapiens cDNA clone IMAGE:2467658 3' similar to SW:RFX5_HUMAN P48382 BINDING REGULATORY FACTOR. ;


Human calbindin 27 gene, exons 10 and 11, and L1 and Alu repeats


AV695974 GKC Homo sapiens cDNA clone GKCBQC11 5'


Homo sapiens proteoglycan 3 (PRG3) gene, complete cds


HOMEOBOX PROTEIN HLX1


Thermotoga maritima section 123 of 136 of the complete genome


Homo sapiens protein kinase PKNbeta (pknbeta), mRNA


Xylella fastidiosa, section 16 of 229 of the complete genome


Arabidopsis thaliana putative c-myb-like transcription factor (MYB3R-3) mRNA, complete cds


Mus musculus solute carrier family 1, member 6 (Slc1a6), mRNA


Human immunodeficiency virus type 1 complete genome (isolate 98SE-MP1213)


Pleuronectes americanus aminopeptidase N (ampN) gene, partial cds


Arabidopsis thaliana DNA chromosome 4, contig fragment No. 30


i 

1 

1 
1 

Q 

I 

1 
X 

o 

I 
x 


wf38b12.x1 Soares_NFL_T_GBC_S1 Homo sapiens cDNA clone IMAGE:2357855 3'


Mus musculus general transcription factor II I (Gtf2i), mRNA


601074110F1 NIH_MGC_12 Homo sapiens cDNA clone IMAGE:3460154 5'


TRANSCRIPTION FACTOR SOX-10


QV3-BT0537-271299-049-e02 BT0537 Homo sapiens cDNA


ta54f11.x1 Soares_total_fetus_Nb2HF8_9w Homo sapiens cDNA clone IMAGE:2047917 3' similar to contains Alu repetitive element;


M.musculus gene for kallikrein-binding protein


Homo sapiens mRNA for KIAA1631 protein, partial cds


Top Hit 


Database 


INT I 


INT 1 


z 




|EST HUMAN S 


jEST HUMAN 


| EST HUMAN 


1 


S 




5 

=3 
X 




SWISSPROT


INT I 


:nt 1 


INT | 


Z 


NT | 


NT I 


NT 


!nt 1 


EST HUMAN j 


EST_HUMAN


INT j 


'EST HUMAN j 


SWISSPROT | 


|EST HUMAN i 


I 


z 


iNT ■ ! 


Top Hit Accession
No. 


AB033019.1


X82032.1


1 

a 


i 

3 


1 

E 


|BE728667.1 ! 


1 
m 


1 


AI937337.1


s 


1 


|AF304354.1 | 


|Q61670 I 


AE001811.1




! 


|AF214117.1 j 




s 

3 


|AF043383.1 \ 


S 

5 

S 


|AI807219.1 | 


3 
< 


1 






|BE072399.1 ! 


AI374601.1


|X61597.1 | 




Most Similar 
(Top) Hit 
BLAST E 
Value 


| 3.9E-01 


I 3.9E-01 




s 


j 3.9E-01 


| 3.9E-01 


j 3.9E-01 


3.9E-01 


3.9E-01 


3.9E-01 


3.9E-01 


j 3.9E-01 


I 3.9E-01 


3.8E-01 


3.8E-01 


i 3.8E-01 


I 3.8E-01 


3.8E-01 


3.8E-01 


| 3.8E-01 


! 3.8E-01 


3.8E-01 


3.8E-01 


I 3.8E-01 


1 3.8E-01 


3.8E-01 


j 3.8E-01 


3.8E-01 


3.8E-01 


j 3.8E-01 


Expression 
Signal 


I 3.54| 


1 


s 


3.63 1 






I 3.53| 






2.97| 


! 2.22 




! 2.38| 




I 7.44| 




? 




! 0.92| 




I 7.77\ 




0.91


| 0.85| 


1 




I 4.47| 




I 4.83| 


I 3.07| 


ORF SEQ 
ID NO: 


| 22367| 


S5 


s 


j 22776 | 


1 


| 24572 | 


| 25611 | 


| 












25328 I 






i 22289| 


1 




| 22732 1 


! 23160| 






| 23545 1 






| 26023 | 






| 27052j 


SEQ ID 

NO: 
1
1
SEQ ID
NO: 




| 2673 [ 


R 
8 
ft
| 561 3 |
| 7503 |
1
1




| 1827| 
| 3841 1
Top Hit Descriptor


DKFZp762K075_r1 762 (synonym: hmel2) Homo sapiens cDNA clone DKFZp762K075 5'


Homo sapiens NF2 gene


Bos taurus partial stat5B gene, exons 2-15 and joined CDS


Human mibp gene, partial cds


yd03e05.r1 Soares infant brain 1NIB Homo sapiens cDNA clone IMAGE:24443 5'


yd03e05.r1 Soares infant brain 1NIB Homo sapiens cDNA clone IMAGE:24443 5'


hg33f02.x1 NCI_CGAP_GC6 Homo sapiens cDNA clone IMAGE:2947419 3'


hg33f02.x1 NCI_CGAP_GC6 Homo sapiens cDNA clone IMAGE:2947419 3'


Mus musculus ribosomal protein S19 (Rps19) gene, complete cds


Rattus norvegicus repeat element associated with the Rasgrf1 gene


Human mRNA for KIAA0323 gene, partial cds


P.irregulare (P3804) gene for actin


RC5-ST0171-181099-011-g07 ST0171 Homo sapiens cDNA


PROTEIN-L-ISOASPARTATE O-METHYLTRANSFERASE (PROTEIN-BETA-ASPARTATE METHYLTRANSFERASE) (PIMT) (PROTEIN L-ISOASPARTYL METHYLTRANSFERASE) (L-ISOASPARTYL PROTEIN CARBOXYL METHYLTRANSFERASE)


Drosophila melanogaster sugar transporter 3 (sut3) mRNA, complete cds


H. sapiens serotonin transporter gene, exons 9 and 10


1 

i 

1 
I 
1 
I 

X 


RC1-HT0545-150600-014-b12 HT0545 Homo sapiens cDNA


Brassica napus mRNA for MAP4K alpha2 protein


Bacteria from anoxic bulk soil 16S rRNA gene (strain XB45)


ha02g04.x1 NCI_CGAP_Lu24 Homo sapiens cDNA clone IMAGE:2872566 3'


Homo sapiens PHEX gene


yt74a06.r1 Soares fetal liver spleen 1NFLS Homo sapiens cDNA clone IMAGE:275987 5'


wt72c10.x1 Soares_thymus_NHFTh Homo sapiens cDNA clone IMAGE:2513010 3' similar to TR:O15117 O15117 FYN BINDING PROTEIN. [1] ;


Arabidopsis thaliana DNA chromosome 4, contig fragment No. 79


Homo sapiens lysosomal-associated membrane protein 2 (LAMP2), transcript variant LAMP2A, mRNA


Homo sapiens lysosomal-associated membrane protein 2 (LAMP2), transcript variant LAMP2A, mRNA


Homo sapiens chromosome 21 segment HS21C004


PROBABLE PEPTIDE ABC TRANSPORTER ATP-BINDING PROTEIN Y4TS


til 

Q 


EST HUMAN | 


z 






IEST HUMAN | 


EST HUMAN [ 


EST HUMAN | 


;EST HUMAN | 




Z 


Z 


z 


EST HUMAN j 


SWISSPROT 


Z 


z 


z 


EST HUMAN | 


z 


z 


EST HUMAN | 


z 


EST HUMAN [ 


EST HUMAN j 




z 


z 




SWISSPROT ] 


Top Hit Accession


AL121154.1


Y18000.1


AJ237934.1


U89241.1


T80255.1


T80255.1


AW590184.1


i 
I 


< 


|AF056927.1 ! 


|AB002321.1 i 


|X76725.1 | 


JAW812033.1 | 


a. 


1 




|X76758.1 i 


BE707883.1 | 


1 

3 


jAJ229237.1 j 


|AW339393.1 | 


1 
> 


|R94090.1 | 


AW027174.1


AL161583.2


1 
8 


| 4504956 1 


AL163204.2


10.53194 ! 


Most Similar 
(Top) Hit 
BLAST E 
Value 
3.6E-01
Top Hit Descriptor


602184016T1 NIH_MGC_42 Homo sapiens cDNA clone IMAGE:4300251 3'


Mus musculus disintegrin 5 (Dtgn5), mRNA


EST36722 Embryo, 8 week 1 Homo sapiens cDNA 5' end


Homo sapiens uridine monophosphate synthetase (orotate phosphoribosyl transferase and orotidine-5'-


< 

i 


Bacteriophage phi-YeO3-12 complete genome


INTERLEUKIN-12 ALPHA CHAIN PRECURSOR (IL-12A) (CYTOTOXIC LYMPHOCYTE MATURATION FACTOR 35 KD SUBUNIT) (CLMF P35)


Streptomyces argillaceus mithramycin biosynthetic genes


Homo sapiens MTA1-L1 gene, complete cds


EXODEOXYRIBONUCLEASE V BETA CHAIN


Arabidopsis thaliana DNA chromosome 4, contig fragment No. 10


Hypoxylon fragiforme chitin synthase gene, partial cds


Rattus norvegicus DNA for regucalcin, partial cds


tp78b12.x1 NCI_CGAP_Ut3 Homo sapiens cDNA clone IMAGE:2205407 3' similar to gb:X57522 ANTIGEN PEPTIDE TRANSPORTER 1 (HUMAN);


R.norvegicus mRNA for 3'UTR of ubiquitin-like protein


R.norvegicus mRNA for 3'UTR of ubiquitin-like protein


601472768T1 NIH_MGC_68 Homo sapiens cDNA clone IMAGE:3875753 3'


601472768T1 NIH_MGC_68 Homo sapiens cDNA clone IMAGE:3875753 3'


ty34h01.x1 NCI_CGAP_Kid11 Homo sapiens cDNA clone IMAGE:2285809 3' similar to contains Alu repetitive element;contains element L1 repetitive element ;


ty34h01.x1 NCI_CGAP_Kid11 Homo sapiens cDNA clone IMAGE:2285809 3' similar to contains Alu repetitive element;contains element L1 repetitive element ;


j2498F Human fetal heart, Lambda ZAP Express Homo sapiens cDNA clone J2498 5' similar to TEGT


602140372F1 NIH_MGC_46 Homo sapiens cDNA clone IMAGE:4301800 5'


za57h01.s1 Soares fetal lung_NbHL19W Homo sapiens cDNA clone IMAGE:297649 3'


RC4-TN0077-250800-011-g04 TN0077 Homo sapiens cDNA


Homo sapiens high-mobility group phosphoprotein (HMGI-C) gene, exons 1-3, complete cds


D.mauritiana Adh gene


D.mauritiana Adh gene


502070802F1 NCI_CGAP_Brn64 Homo sapiens cDNA clone IMAGE:4213585 5'


hv51g02.x1 NCI_CGAP_Lu24 Homo sapiens cDNA clone IMAGE:3176978 3'


I 
1
EST HUMAN
Z
NT
EST HUMAN


|EST HUMAN | 


|eST HUMAN | 


1 
r 


|eST HUMAN | 


|EST HUMAN 
EST HUMAN
NT


|NT 


|EST HUMAN 


|EST_HUMAN 


Top Hit Accession


No. 


1 


1 
8 


AA332734.1


I 4507834 j 


|AJ251805.1 i 


8 


AJ007932.2 


AB012922.1
AL161498.2


AF200446.1
BE619650.1


|BE619650.1 | 


AI628131.1
1


X63953.1


|BF526499.1 


BE219351.1
(Top) Hit


BLAST E 

Value 
3.3E-01


I 3.3E-01 


1 3.3E-01 


I 3.3E-01 


| 3.3E-01 


1 3.3E-01 


3.3E-01 


| 3.3E-01 


Ui 


I 3.3E-01 


3.3E-01 


| 3.3E-01 
I
4.82 1
2.71


i 3.64 








! 3.26 




i 2.07 




j 2.71 
ID NO:
22737


23173 


23433 


23582 


1 

a 






8 
25623


l 


26148 


ft 


1 


27053 


27542 


27504 




1 28242 


28243 




| 28700 


SEQ ID
NO: 


1 
§


3795 




4170 


4477 


5185 


5185! 














1 7337 


I7295 
?






























































SEQ ID
NO: 


1285 




1704 


1 




1 




3450 


1 


I 


3920 


4271 


1 


5263 


5263 


5622 


I 


6117 


6117 


6575 
7477


7507 


7739 


8104 
GALECTIN-3 (GALACTOSE-SPECIFIC LECTIN 3) (MAC-2 ANTIGEN) (IGE-BINDING PROTEIN) (35 KD LECTIN) (CARBOHYDRATE BINDING PROTEIN 35) (CBP 35) (LAMININ-BINDING PROTEIN) (LECTIN L-29) (CBP30)


ob71g02.s1 NCI_CGAP_GCB1 Homo sapiens cDNA clone IMAGE:1336850 3'


Rhizobium leguminosarum sym plasmid pRL5JI nodX gene


Homo sapiens aldehyde oxidase 1 (AOX1), mRNA


Pyrococcus horikoshii OT3 genomic DNA, 287001-544000 nt. position (2/7)


Rattus norvegicus EH domain binding protein Epsin mRNA, complete cds


Arabidopsis thaliana DNA chromosome 4, contig fragment No. 61


Fusarium poae virus 1 RNA2 putative RNA dependent RNA polymerase gene, complete cds


P.vulgaris arc5-1 gene


LACTOSE PERMEASE (LACTOSE-PROTON SYMPORT) (LACTOSE TRANSPORT PROTEIN)


EST369264 MAGE resequences, MAGD Homo sapiens cDNA


EST369264 MAGE resequences, MAGD Homo sapiens cDNA


Botrytis cinerea strain T4 cDNA library under conditions of nitrogen deprivation


6018G8804F1 NIH_MGC_17 Homo sapiens cDNA clone IMAGE:4111512 5'


Mus musculus Pbx/knotted 1 homeobox (Pknox1), mRNA


Homo sapiens promyelocytic leukemia zinc finger protein (PLZF) gene, complete cds


Human hNAT allele 3-2 gene for arylamine N-acetyltransferase


Arabidopsis thaliana DNA chromosome 4, contig fragment No. 46


Rabbit beta-like globin gene cluster encoding the epsilon, gamma, delta (pseudogene) and beta globin polypeptides, complete cds


Homo sapiens jun dimerization protein gene, partial cds; cfos gene, complete cds; and unknown gene


HYPOTHETICAL 81.7 KD PROTEIN C13G7.04C IN CHROMOSOME I PRECURSOR


602081972F1 NIH_MGC_81 Homo sapiens cDNA clone IMAGE:4246505 5'


Mouse renin (Ren-1-d) gene, complete cds


Homo sapiens interleukin 12p40 subunit (IL12B) gene, IL12B-1 allele, complete cds


CVlO-HT0569-060300-239-f10 HT0569 Homo sapiens cDNA


1 
1 

I 

I 

! 
1 
1 

g 
s 


H.sapiens gene fragment for acetylcholine receptor (AChR) alpha subunit exons 8, 9 and 3' flanking region


601897107F1 NIH_MGC_19 Homo sapiens cDNA clone IMAGE:4126633 5'
M60266.1
Top Hit Descriptor


tp21a11.x1 NCI_CGAP_Gas4 Homo sapiens cDNA clone IMAGE:2188412 3' similar to gb:D15050 NIL-2-A ZINC FINGER PROTEIN (HUMAN);contains element L1 repetitive element ;


wrC2f10.x1 NCI_CGAP_GC6 Homo sapiens cDNA clone IMAGE:2480395 3'


zs57d12.r1 NCI_CGAP_GCB1 Homo sapiens cDNA clone IMAGE:701591 5' similar to contains Alu repetitive element;


Mus musculus SKD1 (Skd1) gene, complete cds


Mus musculus SKD1 (Skd1) gene, complete cds


Mus musculus gene, complete cds, similar to EXLM1


yf77e12.s1 Soares infant brain 1NIB Homo sapiens cDNA clone IMAGE:28291 3'


B.subtilis levanase operon levD, levE, levF, levG and sacC (partial) genes for fructose phosphotransferase system polypeptides P16, 18, 28, 30 and levanase


B.subtilis levanase operon levD, levE, levF, levG and sacC (partial) genes for fructose phosphotransferase system polypeptides P16, 18, 28, 30 and levanase


Mus musculus Eph receptor A8 (Epha8), mRNA


Bos taurus myosin 1 mRNA, complete cds


Mus musculus Fliih protein (Fliih) gene, complete cds; and Llglh protein (Llglh) gene, partial cds


PUTATIVE MULTICOPPER OXIDASE YDR506C


Mus musculus major histocompatibility locus class II region; Fas-binding protein Daxx (DAXX) gene, partial cds; Bing1 (BING1), tapasin (tapasin), RalGDS-like factor (RLF), KE2 (KE2), BING4 (BING4), beta1,3-galactosyl transferase (beta1,3-galactosyl tr>


601065830F1 NIH_MGC_10 Homo sapiens cDNA clone IMAGE:3452287 5'


601065830F1 NIH_MGC_10 Homo sapiens cDNA clone IMAGE:3452287 5'


Trypanosoma cruzi stage-specific surface glycoprotein gp82 (gp82) mRNA, partial cds


Torpedo californica mRNA encoding acetylcholine receptor gamma subunit


Torpedo californica mRNA encoding acetylcholine receptor gamma subunit


ny35h02.s1 NCI_CGAP_Pr12 Homo sapiens cDNA clone IMAGE:1273779 similar to contains LTR8.t2 LTR8 repetitive element ;


Campylobacter jejuni NCTC11168 complete genome; segment 5/6


wz88f05.x1 NCI_CGAP_Brn25 Homo sapiens cDNA clone IMAGE:2565921 3' similar to contains element MER29 repetitive element ;


Homo sapiens TNF-a-inducible RNA binding protein (TIRP) gene, complete cds


601482059F1 NIH_MGC_68 Homo sapiens cDNA clone IMAGE:3884559 5'


Chlamydomonas reinhardtii mRNA for nitrite reductase structural locus


Chlamydomonas reinhardtii mRNA for nitrite reductase structural locus
ORF SEQ
ID NO: 


| 23522 




24059 


| ■ 24064| 


I 24065 | 


24401 | 






25505 | 


2551 3 1 


25838 | 


1 


262341 


1 


1 

8 


1 




1 




1 


1 


1 


25248| 




| 251 88 j 


25189| 


SEQ ID 
NO: 


| 13733, 


I 13901 | 


14280 1 


| 14284j 








1 


1544o| 


1 


15724 | 




1 


16115 


S 


1 




1837S| 


1B375| 




1 


19127 


191 84| 


19212| 


1 


19367| 


Probe
SEQ ID


3821 


I 


1 


! 




4729 


5218 


5522 








6171 


6218 


a 


1 


1 


1 




1 


I 




9514 


1 


? 

E 




1 


a CO 
Top Hit Descriptor


Rattus norvegicus A-kinase anchoring protein AKAP150 mRNA, complete cds


Guira guira oocyte maturation factor Mos (c-mos) gene, partial cds


601148733F1 NIH_MGC_19 Homo sapiens cDNA clone IMAGE:3163688 5'


601148733F1 NIH_MGC_19 Homo sapiens cDNA clone IMAGE:3163688 5'


Human mRNA for serine/threonine protein kinase, complete cds


QV1-CT0364-120200-065-b05 CT0364 Homo sapiens cDNA


DKFZp586I2321_r1 586 (synonym: hute1) Homo sapiens cDNA clone DKFZp586I2321


hd44b03.x1 Soares_NFL_T_GBC_S1 Homo sapiens cDNA clone IMAGE:2912333 3'


Escherichia coli K-12 MG1655 section 384 of 400 of the complete genome


Escherichia coli K-12 MG1655 section 384 of 400 of the complete genome


Arabidopsis thaliana DNA chromosome 4, contig fragment No. 65


1 

! 

is 
1 

E 

I 

1 
1 


Toxoplasma gondii 90kDa heat-shock protein (HSP90) mRNA, partial cds


B.taurus microsatellite (ETH121)


X 

s 

i 

i 
i 

s 


Pyrococcus horikoshii OT3 genomic DNA, 777001-994000 nt. position (4/7)


Borrelia burgdorferi (section 66 of 70) of the complete genome


bv44g10.x1 Soares_testis_NHT Homo sapiens cDNA clone IMAGE:1640226 3' similar to contains Alu repetitive element;contains element MER22 repetitive element ;


Mus musculus chromosome X contigA; putative Magea9 gene, Caltractin, NAD(P) steroid dehydrogenase and Zinc finger protein 185


RNA POLYMERASE BETA SUBUNIT (LARGE STRUCTURAL PROTEIN) (L PROTEIN)


Human mRNA for transcription factor AREB6, complete cds


Human mRNA for transcription factor AREB6, complete cds


Bovine adenovirus 3 complete genome


602042601F1 NCI_CGAP_Brn67 Homo sapiens cDNA clone IMAGE:4180129 5'


ql59c11.x1 Soares_NhHMPu_S1 Homo sapiens cDNA clone IMAGE:1876628 3' similar to contains Alu repetitive element;contains element LTR5 repetitive element ;


EST57072 Infant brain Homo sapiens cDNA 5' end


Homo sapiens OCTN2 gene, complete cds


Mesembryanthemum crystallinum fructose-biphosphate aldolase mRNA, complete cds


Mesembryanthemum crystallinum fructose-biphosphate aldolase mRNA, complete cds


UI-H-BI4-aoi-f-04-0-UI.s1 NCI_CGAP_Sub8 Homo sapiens cDNA clone IMAGE:3085182 3'


Top Hit 
Database
Source 


NT | 


!nt 1 


|EST HUMAN | 


EST HUMAN | 


!NT | 


EST HUMAN | 


EST HUMAN | 


EST HUMAN f 


!NT I 




I NT | 


INT | 


NT | 


INT | 




:nt i 


NT | 


EST HUMAN | 




iSWISSPROT j 




NT ! 


NT | 


EST HUMAN j 


EST HUMAN I 


I 

z> 

X 


NT | 


I IN 


,NT | 


EST_HUMAN


Top Hit Accession
No. 


|U67136.1 | 


|AF168050.1 ] 


I 


|BE313442.1 j 




|AW860020.1 | 


|AL047620.1 j 


AW511195.1


|AE000494.1 i 


|AE000494.1 | 


|AL1 51 565.2 | 


|AB020975.1 


AF179480.1


Z14037.1


Z14037.1


AF0000D4.1


AE001180.1


AI090868.1


AL021127.2


|P13615 | 


i 

& 


|D15050.1 j 


|AF030154.1 | 


1 


| 


|AA349997.1 j 


1 


|AF003124.1 j 


|AF003124.1 j 


|BF511215.1 i 


Most Similar 
(Top) Hit 
BLAST E 
Value 


| 2.8E-01 




| 2.8E-01 


2.8E-01 


! 2.8E-01 


j 2.8E-01 


2.8E-01 


i 2.8E-01 


j 2.8E-01 


2.8E-01 


2.8E-01 


! 2.8E-01 




2.8E-01 


2.8E-01 


2.8E-01 


2.8E-01 


2.8E-01 


2.8E-01 


2.8E-01 


2.8E-01 


2.8E-01 


2.8E-01 


2.8E-01 


2.8E-01 


9 


2.8E-01 


s 


2.8E-01 


[ 2.8E-01 


Expression 
Signal 


1.84


I 2.47| 








1.55 




1.47






I 2.47| 


I " 1.37| 




I 2.48| 








2.21 1 


1 




0 


q 










| 2.33| 


5 




I 7.87| 


ORF SEQ

ID NO: 




| 20828 








1 


s 


1 


| 22197, 


8 




| 22388| 






8 


j 23057[ 


| 23604| 




24034| 


i 24038| 


! 24352| 


1 


24394| 


| 24427[ 


? 


% 


| 2541 9| 


! 25784| 


| 25785| 


| 26073; 














SEQ ID

NO: 


10497 


■ 


1 


1 




1 


1 




1 2300 




2375 


1 


I 


1 


i 


I3252 


! 3824! 


:4003 




1 


s 






1 






1 5363 


5677 


:5677| 


1 






























































Probe 
SEQ ID
NO: 








1256 


8 


1 


1 


| 2084 


| 2423 


I 


| 2500 [ 


| 2630 | 


| 2940 i 


| 2941 1 


| 2941 1 


| 3332 j 


| 3915| 


I 410s| 


| 4352) 


1 43571 




| 4673 | 


1 4722[ 


| 4755 


I 4787 




| 5443 




| 5770 1 


1 
Top Hit Descriptor


qp48h01.x1 NCI_CGAP_Co8 Homo sapiens cDNA clone IMAGE:1926289 3' similar to gb:X06323_cds1 MITOCHONDRIAL 60S RIBOSOMAL PROTEIN L3 (HUMAN);


qp48h01.x1 NCI_CGAP_Co8 Homo sapiens cDNA clone IMAGE:1926289 3' similar to gb:X06323_cds1 MITOCHONDRIAL 60S RIBOSOMAL PROTEIN L3 (HUMAN);


Homo sapiens lanosterol 14-alpha demethylase cytochrome P450 (CYP51) gene, exon 5


602022987F1 NCI_CGAP_Brn67 Homo sapiens cDNA clone IMAGE:4158525 5'


Mus musculus centrin (Cetn2) gene, complete cds


Lycopersicon esculentum peroxidase (TPX1) mRNA, complete cds


Homo sapiens hypothetical protein (LOC51319), mRNA


601880794F1 NIH_MGC_55 Homo sapiens cDNA clone IMAGE:4109350 5'


601880794F1 NIH_MGC_55 Homo sapiens cDNA clone IMAGE:4109350 5'


601852148F1 NIH_MGC_56 Homo sapiens cDNA clone IMAGE:4076026 5'


Drosophila heteroneura fruitless (fru) gene, alternative splice products, 5' flanking region, exons 1 through 7 and complete cds


602137418F1 NIH_MGC_83 Homo sapiens cDNA clone IMAGE:4273853 5'


Mus musculus DNA for prostaglandin DV synthase, complete cds


PM4-HT0606-030400-001-a07 HT0606 Homo sapiens cDNA


Homo sapiens CDC42-binding protein kinase beta (DMPK-like) (CDC42BPB), mRNA


Rattus norvegicus CDK104 mRNA


zx39b10.s1 Soares_total_fetus_Nb2HF8_9w Homo sapiens cDNA clone IMAGE:788827 3' similar to contains Alu repetitive element;


Ipomoea purpurea transposable element Tip100 gene for transposase, complete cds


I 
J 


! 
1 

1 
1 

D 

•5. 

| 

X 

s 

x 

1 
i 
1 
1 

73 


GAG POLYPROTEIN [CONTAINS: INNER COAT PROTEIN P12; CORE PROTEIN P15; CORE SHELL PROTEIN P30; NUCLEOPROTEIN P10]


Rattus norvegicus vesicular monoamine transporter type 2, promoter region and exon 1


Feline immunodeficiency virus env gene, isolate 1 1 1 U088PIU (M88), partial


ta43c11.x2 NCI_CGAP_Lu25 Homo sapiens cDNA clone IMAGE:2046836 3' similar to contains element L1 repetitive element ;


CM1-HT0875-060900-385-e05 HT0875 Homo sapiens cDNA


wo92e11.x1 NCI_CGAP_Kid11 Homo sapiens cDNA clone IMAGE:2462828 3'


Homo sapiens DiGeorge syndrome critical region, telomeric end


Triticum aestivum (Wcs66) gene, complete cds


Top Hit 
Database 


z 

1 

D 


| EST HUMAN | 


NT I 


EST_HUMAN


1N| 


NT | 


2 


EST HUMAN | 


EST HUMAN I 


EST_HUMAN [ 




H 


| 


EST HUMAN j 


NT I 


z 


EST_HUMAN


2 


NT | 


EST HUMAN | 


| SWISSPROT | 


2 


NT | 


EST HUMAN | 


| 

X 


EST HUMAN | 


NT | 


NT | 


Top Hit Accession


AI346126.1 


1 

s 


U51688.1


l 


AF080592.1 


L13654.1


P 

s 


BF241062.1


BF241062.1


i 
i 

m 


< 


BF674023.1 


D83329.1


BE178699.1


S 


Y17324.1


AA450061.1 


I 


X79815.1 


W58067.1 


P03341 


I 


> 


AI310858.1


s 

2 


< 


L77569.1 


L27516.1 


Most Similar 
(Top) Hit 
BLAST E 
Value 


2.8E-01


2.8E-01


2.8E-01


2.8E-01


2.8E-01


2.8E-01


I


2.8E-01


2.8E-01


2.8E-01


2.8E-01


2.8E-01


2.8E-01


2.8E-01


2.8E-01


2.7E-01


2.7E-01


2.7E-01


2.7E-01


2.7E-01


2.7E-01


1


2.7E-01


2.7E-01


2.7E-01


2.7E-01


2.7E-01


2.7E-01


Expression 
Signal 


1.24






7.25| 


1.351 


1.151 




2.27| 


2.27| 












! 








1.73| 


2.96| 






8.78| 


3.51 1 


1.25 






0.89| 


ORF SEQ
ID NO: 


| 26781 




I 




27388 




1 




28262 1 


I 








25236 | 








§ 










22094 1 


221 80 | 




2361 8| 


s 




Exon 
SEQ ID 
NO: 














































12197) 








1 


1 


Probe 
SEQ ID 
NO: 


| 6713 


| 6713) 


6778 1 


1 


7311 


7600 1 




81 26 1 


81261 


81 53 1 




I 


9552 1 


8 


1 


1 




1 






B 


1 


2316| 




2956 1 




8 


4796 1 
Top Hit Descriptor


RC1-CT0286-230200-016-e03 CT0286 Homo sapiens cDNA


HOMEOBOX PROTEIN HOX-A4 (CHOX-1.4)


FIBRILLIN 1 PRECURSOR


Bos taurus micromolar calcium activated neutral protease 1 (CAPN1) gene, exons 11-20, and partial cds


Bos taurus micromolar calcium activated neutral protease 1 (CAPN1) gene, exons 11-20, and partial cds


THREONYL-TRNA SYNTHETASE (THREONINE--TRNA LIGASE) (THRRS)


THREONYL-TRNA SYNTHETASE (THREONINE--TRNA LIGASE) (THRRS)


z 

0 

< 
1 


Oryctolagus cuniculus calgranulin C mRNA, partial cds


Mus musculus transcription factor NF-ATc isoform a (NF-ATca) mRNA, complete cds


AV705043 ADB Homo sapiens cDNA clone ADBCOD05 5'


AV705043 ADB Homo sapiens cDNA clone ADBCOD05 5'


Homo sapiens caveolin-1/-2 locus, Contig1, D7S522, genes CAV2 (exons 1, 2a, and 2b), CAV1 (exons 1 and 2)


G.gallus mRNA for ryanodine receptor type 3


Homo sapiens fragile 16D oxido reductase (FOR) gene, exon 6


IROQUOIS-CLASS HOMEODOMAIN PROTEIN IRX-2


Bos taurus mRNA for mb-1, complete cds


601510838F1 NIH_MGC_71 Homo sapiens cDNA clone IMAGE:3912345 5'


Glycine max pseudogene for Bd 30K


Arabidopsis thaliana DNA chromosome 4, contig fragment No. 2


Arabidopsis thaliana DNA chromosome 4, contig fragment No. 2


Db04d10.x1 NIH_MGC_14 Homo sapiens cDNA clone IMAGE:2958451 3' similar to gb:M36072 60S RIBOSOMAL PROTEIN L7A (HUMAN); gb:M14689_cds1 Mouse surfeit locus surfeit 3 protein gene (MOUSE);


Human prealbumin gene, complete cds


1 

i 
1 

1 


601126016F1 NIH_MGC_9 Homo sapiens cDNA clone IMAGE:2990043 5'


EST386635 MAGE resequences, MAGM Homo sapiens cDNA


hv30c02.x1 NCI_CGAP_Lu24 Homo sapiens cDNA clone IMAGE:3174914 3' similar to contains L1.t3 L1 repetitive element;


Bacteriophage T2 DNA-(adenine-N6)methyltransferase (dam) gene, complete cds


sje 


iUMAN | 


iPROT | 


iPROT | 






iPROT | 


iPROT | 


iPROT | 






IUMAN j 


| 








.PROT | 




< 
5 








I 






| 


IUMAN | 








jEST h 




s 


| NT 


I NT 


| SWISS 


<A 
% 


1 


| NT 


| NT 


jEST F 


Iest f 


NT 


jNT 


'NT 


w 


iNT 


jEST F 


|NT 


jNT 


|nt 


EST F 


| NT 


jNT 


jEST F 


jEST F 


EST V 


| NT 


Top Hit Accession
No. 


AW856131.1


P17277


Q61554


AF248054.1


AF248054.1


I 
§ 


1 
8 


1 


AF091848.1


AF087434.1


AV705043.1


AV705043.1


AJ133269.1


X95267.1


AF217491.1


1 


I 

Q 


| 


AB013290.1


AL161472.2


1 

< 


AW733152.1 


M11844.1


Y12996.1


BE272440.1


AW974531.1


BE217816.1


M22342.1


Most Similar 
(Top) Hit 
BLAST E 
Value 


2.7E-01[ 


2.7E-01 1 


S 


2.7E-01 1 


2.7E-01 1 


2.7E-01 1 


2.7E-01 1 


2.7E-01| 


2.7E-01 1 


2.7E-01| 


2.7E-01I 


2.7E-01 1 


2.7E-01 1 


2.7E-01 1 


2.7E-01| 


1 


2.6E-01I 


2.6E-01| 


s 


2.6E-01 1 


2.6E-01 1 


2.6E-01 


I 


2.6E-01 1 


2.6E-01I 


2.6E-01 1 


2.6E-01 1 


2.6E-01 1 


Expression 
Signal 




3.49| 








1 




2.24| 




8 










2.15| 






2.23 1 






s 






s 






0.94


| 1.121 


ORF SEQ
ID NO: 




| ' 24838 | 


8 


| 26565 | 




| 27453 | 


1 






j 27921 | 


I 28322I 


I 28323 1 


28333 | 










j 21133| 


I 211881 


| 21624| 


| 




1 










§ 


SEQ ID 
NO: 


i 


| 15144 


1 












































I 12374, 


% 


I 13401 


1 


Probe 
SEQ ID 
NO: 


1 


| 5221 


S 
8 


| 6527| 


| 6527| 


| 7378 | 


s 


s 

K 


| 7801 1 


| 7827| 


I 81871 


I 81871 


| 81 97 1 


8 


| 9810| 


1 




| 1371| 


1 14181 


| 1854| 


| 1854| 


I 




I 2424' 


1 


| 3053 j 


I 3485 


1 
Top Hit Descriptor


Homo sapiens acetylcholinesterase collagen-like tail subunit (COLQ) gene, exons 1A, 2, 3, 4, and 5


Rattus norvegicus mRNA for organic anion transporter 3, complete cds


EST371580 MAGE resequences, MAGF Homo sapiens cDNA


QV1-BT0630-040400-132-e03 BT0630 Homo sapiens cDNA


Enterococcus faecium strain N97-330 vanD glycopeptide resistance gene cluster, complete cds; and unknown gene


Gallus gallus mRNA for skeletal myosin heavy chain, complete cds


Gallus gallus mRNA for skeletal myosin heavy chain, complete cds


as89d07.r1 Stratagene fetal retina 937202 Homo sapiens cDNA clone IMAGE:838477 5'


Arabidopsis thaliana PSI type III chlorophyll a/b-binding protein (Lhca3*1) mRNA, complete cds


Ophrestia radicosa maturase-like protein (matK) gene, complete cds; chloroplast gene for chloroplast product


yj51e05.r1 Soares placenta Nb2HP Homo sapiens cDNA clone IMAGE:152288 5'


ACYL-COA DEHYDROGENASE, MEDIUM-CHAIN SPECIFIC, MITOCHONDRIAL PRECURSOR (MCAD)


Thermotoga maritima section 123 of 136 of the complete genome


ts02e12.x1 NCI_CGAP_Pan1 Homo sapiens cDNA clone IMAGE:2227438 3' similar to SW:NDF1_RAT Q64289 NEUROGENIC DIFFERENTIATION FACTOR 1 contains element LTR1 repetitive element ;


ts02e12.x1 NCI_CGAP_Pan1 Homo sapiens cDNA clone IMAGE:2227438 3' similar to SW:NDF1_RAT Q64289 NEUROGENIC DIFFERENTIATION FACTOR 1 contains element LTR1 repetitive element ;


yf37a03.s1 Soares fetal liver spleen 1NFLS Homo sapiens cDNA clone IMAGE:129004 3' similar to gb:X12517 U1 SMALL NUCLEAR RIBONUCLEOPROTEIN C (HUMAN);


ye82a07.r1 Soares fetal liver spleen 1NFLS Homo sapiens cDNA clone IMAGE:124212 5'


602014422F1 NCI_CGAP_Brn64 Homo sapiens cDNA clone IMAGE:4150396 5'


HYPOTHETICAL 75.2 KD PROTEIN C11C11.02 IN CHROMOSOME II


RC5-ET0082-310500-021-F10 ET0082 Homo sapiens cDNA


RC5-ET0082-310500-021-F10 ET0082 Homo sapiens cDNA


VON WILLEBRAND FACTOR PRECURSOR (VWF)


Human lambda-immunoglobulin constant region complex (germline)


Mus musculus jerky (Jrk), mRNA


601511052F1 NIH_MGC_71 Homo sapiens cDNA clone IMAGE:3912612 5'


Homo sapiens Na/K-ATPase gamma subunit (FXYD2) gene, complete cds, alternatively spliced


Top Hit 
Database 
Source 




£ 


EST_HUMAN


EST HUMAN | 


z 


7. 


z 


EST_HUMAN j 


Z 


z 


1 
ro 


SWISSPROT [ 


z 


EST HUMAN 


| 


EST HUMAN | 


EST_HUMAN I 


EST HUMAN | 


SWISSPROT | 


EST HUMAN | 


EST HUMAN j 


: SWISSPROT [ 


z 


z 


EST_HUMAN ( 


z 


Top Hit Accession
No. 


AF229118.1 | 


|AB017446.1 | 


jAW959510.1 | 


1 


1 
< 


AB021180.1


AB021180.1


AA457617.1


1 

3 




1 

s 




jAE001811.1 ] 


1 


AI582557.1 




|R02411.1 I 


|BF343588.1 | 


1 

5 


jBE830339.1 | 




|Q28295 | 


X51755.1




BE883491.1




Most Similar 
(Top) Hit 
BLAST E 
Value 


2.6E-01


2.6E-01




2.6E-01


2.6E-01


2.6E-01


2.6E-01


2.6E-01


1 


2.6E-01I 


1 






2.6E-01 


2.6E-01 


2.6E-01 1 


2.6E-01| 


2.6E-01| 


1 


2.6E-01| 




2.6E-01 1 


2.6E-01 1 


1 


! 2.6E-01| 


1 


Expression 










! 0.95 


! 0.84j 










3.56| 








1.93 






I 3.01! 




1 






93.65[ 




I 1.92| 




ORFSEQ 
ID NO: 


23295 






| 23742 I 


23935 


24072 j 


R 


241 34 | 


24224 1 


24292 I 


24554 | 


1 








1 


1 


1 


1 


i 27112| 


j 271131 










| 25309! 


SEQ ID 
NO: 


| 13505 1 




| 13916| 




| 14158| 


l 






















1 


! 167311 


1 


1 


1 169231 


| 17704| 






| 19655| 


| 19053| 


Probe 
SEQ ID 

NO: 


1 


| 3651 


| 4010 


1 


| 4259; 


| 4393 1 


I 


| 4448 [ 




| 4eie| 


1 


| 5051 | 


I 5700 j 


I 


5763 


| 6552 


| 65851 


1 


1 


I 7046 


1 7046 


| 7854 


1 


1 9136 




1 
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Top Hit Descriptor 


Homo sapiens inositol polyphosphate 1-phosphatase (INPP1) gene, complete cds


ADENYLATE CYCLASE (ATP PYROPHOSPHATE-LYASE) (ADENYLYL CYCLASE)


Homo sapiens ATP synthase, H+ transporting, mitochondrial F1 complex, delta subunit (ATP5D), nuclear
gene encoding mitochondrial protein, mRNA


Homo sapiens ATP synthase, H+ transporting, mitochondrial F1 complex, delta subunit (ATP5D), nuclear
gene encoding mitochondrial protein, mRNA


Starfish (P.ochraceus) cytoplasmic actin gene, complete cds


Mus musculus ICR/Swiss glyceraldehyde 3-phosphate dehydrogenase (Gapd-S) gene, complete cds


Ureaplasma urealyticum section 57 of 59 of the complete genome


1 

1 
$ 

I 
1 

X 

R 
1 

1 

Y 


Botrytis cinerea strain T4 cDNA library under conditions of nitrogen deprivation


Homo sapiens hyperpolarization activated cyclic nucleotide-gated potassium channel 4 (HCN4) mRNA


PM4-CT0400-310700-005-d08 CT0400 Homo sapiens cDNA


PM4-CT0400-310700-005-d08 CT0400 Homo sapiens cDNA


Aquifex aeolicus section 7 of 109 of the complete genome


zs11a12.r1 NCI_CGAP_GCB1 Homo sapiens cDNA clone IMAGE:684862 5'


B.taurus mRNA for D-aspartate oxidase


EST385464 MAGE resequences, MAGM Homo sapiens cDNA


Arabidopsis thaliana DNA chromosome 4, contig fragment No. 29


wg11c07.x1 Soares_NSF_F8_9W_OT_PA_P_S1 Homo sapiens cDNA clone IMAGE:2364780 3'


wg11c07.x1 Soares_NSF_F8_9W_OT_PA_P_S1 Homo sapiens cDNA clone IMAGE:2364780 3'


A-AGGLUTININ ATTACHMENT SUBUNIT PRECURSOR


RHIB PROTEIN


Mus musculus neuronal apoptosis inhibitory protein 6 (Naip6) gene, complete cds; and Naip3 gene, exons 2-9
and 11-16


MOLT-INHIBITING HORMONE PRECURSOR (MIH)


Choristoneura fumiferana diapause associated protein 2 (DAP2) mRNA, complete cds


Vibrio cholerae chromosome II, section 73 of 93 of the complete chromosome


Mus musculus annexin V gene, intron 4 segment containing 5' LTR and gag portion of MuERV-L (murine
endogenous retrovirus) element
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| 


Mus musculus gene for uncoupling protein 3, 5'-flanking region and partial 5'UTR


hh75f09.y1 NCI_CGAP_GU1 Homo sapiens cDNA clone IMAGE:2968649 5' similar to contains TAR1.t2
TAR1 repetitive element ;


hh75f09.y1 NCI_CGAP_GU1 Homo sapiens cDNA clone IMAGE:2968649 5' similar to contains TAR1.t2
TAR1 repetitive element ;


T3 receptor-associating cofactor-1 [human, fetal liver, mRNA, 2930 nt]


Mus musculus SKD1 (Skd1) gene, complete cds


Homo sapiens chromosome 21 segment HS21C082


7l57a03.x1 Soares_NSF_F8_9W_OT_PA_P_S1 Homo sapiens cDNA clone IMAGE:3G25389 3'


601459238F1 NIH_MGC_S6 Homo sapiens cDNA clone IMAGE:3862809 5'


yq84f07.r1 Soares fetal liver spleen 1NFLS Homo sapiens cDNA clone IMAGE:202501 5'


Homo sapiens matrix metalloproteinase MMP Rasi-1 gene, promoter region


Hordeum vulgare receptor-like kinase LRK10 gene, partial cds


Hordeum vulgare receptor-like kinase LRK10 gene, partial cds


RC3-ST0186-130100-015-a07 ST0185 Homo sapiens cDNA


xg40c10.x1 NCI_CGAP_Ut1 Homo sapiens cDNA clone IMAGE:2630034 3' similar to contains Alu repetitive
element;contains element MSR1 repetitive element ;


Mouse L1Md LINE DNA


Human mRNA for KIAA0124 gene, partial cds


Zea mays cellulose synthase-4 (CesA-4) mRNA, complete cds


Arabidopsis thaliana DNA chromosome 4, contig fragment No. 41


Spodoptera frugiperda CALNUC mRNA, complete cds


602132442F1 NIH_MGC_81 Homo sapiens cDNA clone IMAGE:4271578 5'


Homo sapiens KIAA0851 gene (partial), XT3 gene and LZTFL1 gene


Homo sapiens KIAA0851 gene (partial), XT3 gene and LZTFL1 gene


Homo sapiens FLI-1 gene, partial


Mesembryanthemum crystallinum putative potassium channel protein Mkt1p mRNA, complete cds


Zaocys dhumnades fructose-1,6-bisphosphatase mRNA, complete cds


Homo sapiens serine palmitoyl transferase, subunit II gene, complete cds; and unknown genes


IMMUNOGLOBULIN A1 PROTEASE PRECURSOR (IGA1 PROTEASE)


Aquifex aeolicus section 12 of 109 of the complete genome
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1 

i 

: 


7h23d04.x1 NCI_CGAP_Co16 Homo sapiens cDNA clone IMAGE:3316807 3' similar to SW:PRSB_XENLA
O42586 26S PROTEASE REGULATORY SUBUNIT 6A ;


D.discoideum (Ax3-K) ponA gene


£ 
1 


Bovine adenovirus 3 complete genome


Oryza longistaminata receptor kinase-like protein, family member D, and retrofit (gag/pol) genes, complete cds


H. sapiens AGT gene, PstI fragment of intron 4


QV1-HT0412-020400-136-b10 HT0412 Homo sapiens cDNA


Hepatitis C virus genomic RNA for polyprotein, complete cds


Mus musculus Wrn protein (Wrn) gene, complete cds


Mus musculus Wrn protein (Wrn) gene, complete cds


7i54d04.x1 NCI_CGAP_Br16 Homo sapiens cDNA clone IMAGE:3338503 3' similar to SW:SFR4_HUMAN
Q08170 SPLICING FACTOR, ARGININE/SERINE-RICH 4 ;contains element TAR1 TAR1 repetitive element


Drosophila melanogaster p38a MAP kinase gene, complete cds


Homo sapiens HSPC142 protein (HSPC142), mRNA


wce2c11.x1 NCI_CGAP_Pan1 Homo sapiens cDNA clone IMAGE:2323220 3' similar to gb:J03464
PROCOLLAGEN ALPHA 2(I) CHAIN PRECURSOR (HUMAN);


Bos taurus guanylyl cyclase-activating protein 2 (guca2) mRNA, complete cds


Tetrahymena thermophila macronuclear gene encoding ribosomal protein L3, exons 1-2


wd43e02.x1 Soares_NFL_T_GBC_S1 Homo sapiens cDNA clone IMAGE:2330906 3' similar to contains
MER22.M TAR1 repetitive element ;


COLLAGEN ALPHA 1 (X) CHAIN PRECURSOR


Arabidopsis thaliana DNA chromosome 4, contig fragment No. 6


Mus musculus type 1 sigma receptor gene, complete cds
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I 
i 

s. 


Homo sapiens fragile 16D oxido reductase (FOR) gene, exon 6


Arabidopsis thaliana ethylene-insensitive3-like1 (EIL1) mRNA, complete cds


Mus musculus mRNA for putative mc7 protein (mc7 gene)


Gallus gallus gene coding for a-actin


RC3-CT0413-100800-023-b06 CT0413 Homo sapiens cDNA


Homo sapiens chromosome 21 segment HS21C081


aromatase [Poephila guttata=zebra finches, ovary, mRNA, 3188 nt]
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Top Hit Descriptor 


Mycoplasma genitalium section 35 of 51 of the complete genome


Methanococcus jannaschii section 138 of 150 of the complete genome


601142073F1 NIH_MGC_14 Homo sapiens cDNA clone IMAGE:3505818 5'


Mus musculus vacuolar protein sorting 4b (yeast) (Vps4b), mRNA


Yersinia pestis HmsH (hmsH), HmsF (hmsF), HmsR (hmsR), and HmsS (hmsS) genes, complete cds


Brassica napus slg gene for S-locus glycoprotein, cultivar T2


Mus musculus cdh5 gene, exon 1, partial


Homo sapiens partial intron 3 of the wild type AF-4/FEL gene


601175562F1 NIH_MGC_17 Homo sapiens cDNA clone IMAGE:3531015 5'


Human erythropoietin gene, complete cds


Marinilabilia agarovorans gyrB gene for DNA gyrase subunit B, partial cds, strain:IFO 14957


no16d06.s1 NCI_CGAP_Phe1 Homo sapiens cDNA clone IMAGE:1100843 3' similar to contains Alu
repetitive element;contains element THR repetitive element ;


yh21b07.s1 Soares placenta Nb2HP Homo sapiens cDNA clone IMAGE:130357 3'


yr97h10.r1 Soares fetal liver spleen 1NFLS Homo sapiens cDNA clone IMAGE:213283 5'


GSTA5=glutathione S-transferase Yc2 subunit {5' region, intron 1} [rats, Morris hepatoma cell line, Genomic,
2212 nt, segment 1 of 3]


Homo sapiens KIAA0450 gene product (KIAA0450), mRNA


yj17f01.r1 Soares placenta Nb2HP Homo sapiens cDNA clone IMAGE:149017 5'


Mus musculus renin (Ren-1c) gene, promoter region


Synechocystis sp. PCC6803 complete genome, 1/27, 1-133859


Homo sapiens mitogen-activated protein kinase p38delta (PRKM13) mRNA, complete cds


Homo sapiens nuclear transport factor 2 (placental protein 15) (PP15) mRNA


Human phenylethanolamine N-methyltransferase gene, complete cds


601896136F1 NIH_MGC_19 Homo sapiens cDNA clone IMAGE:4125368 5'


Human hereditary haemochromatosis region, histone 2A-like protein gene, hereditary haemochromatosis
(HLA-H) gene, RoRet gene, and sodium phosphate transporter (NPT3) gene, complete cds


Escherichia coli K-12 MG1655 section 130 of 400 of the complete genome


Homo sapiens mRNA for KIAA1512 protein, partial cds


7k30b06.x1 NCI_CGAP_Ov18 Homo sapiens cDNA clone IMAGE:3476699 3' similar to SW:GAG_SMSAV
P03330 GAG POLYPROTEIN [CONTAINS: CORE PROTEIN P15; INNER COAT PROTEIN P12; CORE
SHELL PROTEIN P30; NUCLEOPROTEIN P10]. ;


Top Hit 
Database 


7 


7 


EST_HUMAN


7 


7 


7 


7 


7 


EST_HUMAN


7 


7 


EST_HUMAN


EST_HUMAN


EST_HUMAN


7 


7 


EST_HUMAN


7 


7 


7 


7 


7 


EST_HUMAN


7 


z 


7 


i 


Top Hit Accession
No. 


U39713.1


1 


BE311893.1


| 667798C| 


U22837.2


AJ245480.1


Y10887.2


1 

s- 


BE297718.1


M11319.1


AB015033.1


AA501379.1


R21732.1




S82821.1




R82252.1


l 






s 


J03280.1


BF316135.1


U91328.1


AE000240.1


AB040945.1


BF058381.1 


I.LU 
E I a> 


s 


s 




<? 


s 


s 










S 


9 


s 


S 


s 


s 


1 




s 


s 


S 


s 


s 


S 


S 


s 




Most Similar
(Top) Hit
BLAST E
Value






i 








































a 










Expression 
Signal 


I 4.42| 










s 






s 














3.72| 


I 0.85j 














0.98 


1 






ORFSEQ 
ID NO: 




| 20403 i 




| 




21351 | 






221691 






1 




23051 I 


23461 | 




! 23926 j 




1 




241 40 | 


1 


I 




24783 | 


1 


1 


Exon 
SEQ ID 
NO: 


| 10558| 


| 10587| 




| 11398| 
































14282| 






| 14877 1 






1 


15252 


SEQ ID 
NO: 








| 1494 


| 154S| 


| 1586| 


I 16141 


1 


1 




1 


i 


| 3045 1 


I 3324I 


| 376e| 


8 


| 4253 1 


1 


1 43491 




| 4454 | 


1 


| 5002 


• 

5101 


| 514S, 


5 

a 


1 






WO 01/57274 



PCT/US01/00666 







Top Hit Descriptor 


PM3-CT0263-241299-009-b07 CT0263 Homo sapiens cDNA


Mus musculus deformed epidermal autoregulatory factor 1 (Drosophila) (Deaf1), mRNA


MR1-TN0045-110900-006-C02 TN0045 Homo sapiens cDNA


za04f08.r1 Soares melanocyte 2NbHM Homo sapiens cDNA clone IMAGE:291591 5'


LARGE PROLINE-RICH PROTEIN BAT2 (HLA-B-ASSOCIATED TRANSCRIPT 2)


Brachydanio rerio ependymin beta and gamma chains (Epd) gene, complete cds


Funaria hygrometrica chloroplast-localized small heat shock protein (CPsHSP21) mRNA, complete cds;
nuclear gene for chloroplast product


601869724F1 NIH_MGC_19 Homo sapiens cDNA clone IMAGE:4100189 5'


Drosophila 68C glue gene cluster


Homo sapiens H-2K binding factor-2 (LOC51580), mRNA
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I 


Homo sapiens chromosome Xq28 melanoma antigen family A2a (MAGEA2A), melanoma antigen family A12
(MAGEA12), melanoma antigen family A2b (MAGEA2B), melanoma antigen family A3 (MAGEA3), caltractin
(CALT), NAD(P)H dehydrogenase-like protein (NSDHL), and Ll>


Vitis vinifera cultivar Pinot Noir plasma membrane aquaporin (PIP1a) mRNA, complete cds


RC1-CT0249-141199-021-g04 CT0249 Homo sapiens cDNA


AV694801 GKC Homo sapiens cDNA clone GKCAHB02 5'


nm31e11.s1 NCI_CGAP_Lip2 Homo sapiens cDNA clone IMAGE:1061804


Arabidopsis thaliana DNA chromosome 4, contig fragment No. 16


Chlamydia muridarum, section 45 of 85 of the complete genome


Mus musculus interferon (alpha and beta) receptor 2 (Ifnar2), mRNA


Mus musculus interferon (alpha and beta) receptor 2 (Ifnar2), mRNA


ok73e02.s1 NCI_CGAP_GC4 Homo sapiens cDNA clone IMAGE:1519610 3' similar to gb:K02765
COMPLEMENT C3 PRECURSOR (HUMAN);


602083129F1 NIH_MGC_81 Homo sapiens cDNA clone IMAGE:4247503 5'


Homo sapiens potassium voltage-gated channel, subfamily H (eag-related), member 4 (KCNH4), mRNA


Beta vulgaris mitochondrion, complete genome


IMMEDIATE-EARLY PROTEIN IE180


IMMEDIATE-EARLY PROTEIN IE180


Homo sapiens mRNA for KIAA1215 protein, partial cds


Homo sapiens pshsp47 gene, complete cds


Homo sapiens hox11 proto-oncogene, exons 1 to 3 and hug-1 gene
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Top Hit Descriptor 


Homo sapiens mRNA, chromosome 1 specific transcript KIAA0505


Homo sapiens sodium/iodide symporter mRNA, partial cds


Homo sapiens 14q32 Jagged2 gene, complete cds; and unknown gene


Methanococcus jannaschii section 67 of 150 of the complete genome


601449441F1 NIH_MGC_65 Homo sapiens cDNA clone IMAGE:3853330 5'


H.sapiens Na+-D-glucose cotransport regulator gene


Homo sapiens full length insert cDNA YH85A11


HOMEOBOX PROTEIN GLABRA2 (HOMEOBOX-LEUCINE ZIPPER PROTEIN ATHB-10) (HD-ZIP
PROTEIN ATHB-10)


xp15b02.x1 NCI_CGAP_HN9 Homo sapiens cDNA clone IMAGE:2740395 3' similar to contains element
M5R21 repetitive element ;


CED-11 PROTEIN


C.parasitica eapC gene


QV4-EN0032-19350C-223-e03 EN0032 Homo sapiens cDNA


Homo sapiens gamma-glutamyl hydrolase gene, exons 8 and 9 and complete cds


Homo sapiens hypothetical protein ASH1 (ASH1), mRNA


Homo sapiens putative psihHbD pseudogene for hair keratin, exons 1 to 9


Rat SOD-2 gene for manganese-containing superoxide dismutase


Homo sapiens dual oxidase-like domains 2 (DUOX2), mRNA


Saccharomyces cerevisiae Hal5p (HAL5) mRNA, complete cds


1 
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1 
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J 

1 

! 


PM1-CT0247-14'0&C 001 g06 CT0247 Homo sapiens cDNA


Andes virus strain OI23133 glycoprotein G1 and G2 precursor, gene, partial cds


M.musculus scp2 gene exon 14


Chlamydia trachomatis section 5 of 87 of the complete genome


Homo sapiens filamin 2 (FLN2) mRNA, complete cds


Arabidopsis thaliana root gravitropism control protein (PIN2) gene, complete cds


Arabidopsis thaliana root gravitropism control protein (PIN2) gene, complete cds


Salvelinus pluvius mRNA for transferrin, complete cds


Salvelinus pluvius mRNA for transferrin, complete cds


Pimephales promelas liver glucose-6-phosphate-1-dehydrogenase mRNA, partial cds


ov30a10.s1 Soares_testis_NHT Homo sapiens cDNA clone IMAGE:1643610 3'


Homo sapiens Ku70-binding protein (KUB3) mRNA, partial cds
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Top Hit Descriptor 


Mus musculus fructosamine 3 kinase (Fn3k), mRNA


Rattus norvegicus Aryl hydrocarbon receptor nuclear translocator 1 (Arnt1), mRNA


Mus musculus pale ear (ep) gene, wild type allele, 3' region, partial cds


Homo sapiens lambda/iota protein kinase C-interacting protein mRNA, complete cds


Homo sapiens lambda/iota protein kinase C-interacting protein mRNA, complete cds


RC3-BT0502-251199-011-d01 BT0502 Homo sapiens cDNA


RC3-BT0502-251199-011-d01 BT0502 Homo sapiens cDNA


Mus musculus interleukin 2 receptor, gamma chain (Il2rg), mRNA


EST67784 Fetal lung II Homo sapiens cDNA 5' end


Sorghum bicolor 22 kDa kafirin cluster


Plasmodium vivax reticulocyte binding protein-2 (rbp-2) gene, complete cds


Homo sapiens hypothetical protein FLJ10581 (FLJ10581), mRNA


Sigmodon hispidus p53 gene, partial cds


Gallus gallus ovalbumin (Y) gene, complete cds


Mouse gene for immunoglobulin diversity region D1


yf42f10.r1 Soares fetal liver spleen 1NFLS Homo sapiens cDNA clone IMAGE:129547 5'


Rattus norvegicus arylacetamide deacetylase gene, complete cds


PAIR-RULE PROTEIN ODD-PAIRED


Schizosaccharomyces pombe DNA for cytoplasmic dynein heavy chain, complete cds


CM3-CT0315-271199-045-M 1 CT0315 Homo sapiens cDNA


MR1-FN0010-290700-007-d04 FN0010 Homo sapiens cDNA


Arabidopsis thaliana DNA chromosome 4, contig fragment No. 5


Fugu rubripes genes encoding carbamoyl phosphate synthetase III, myosin light chain, MAP2


IL3-CT0215-180200-087-D02 CT0215 Homo sapiens cDNA


Rattus norvegicus chemokine receptor CXCR3 mRNA, complete cds


KINESIN-LIKE PROTEIN KIF4


Phoca vitulina partial aar2B gene for alpha adrenergic receptor 2B


S.mansoni elastase HP1 gene


ts93g12.x1 NCI_CGAP_GC6 Homo sapiens cDNA clone IMAGE:2238886 3' similar to gb:M21574 ALPHA
PLATELET-DERIVED GROWTH FACTOR RECEPTOR PRECURSOR (HUMAN);


Mus musculus Notch gene homolog 3, (Drosophila) (Notch3), mRNA


xf29a07.x1 NCI_CGAP_Ut1 Homo sapiens cDNA clone IMAGE:2619444 3' similar to gb:M73779 RETINOIC
ACID RECEPTOR ALPHA-1 (HUMAN);


Homo sapiens DNA polymerase epsilon catalytic subunit protein (POLE1) gene, exon 1a
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Top Hit Descriptor 


AU133116 NT2RP4 Homo sapiens cDNA clone NT2RP4001328 5'


yg09a12.s1 Soares infant brain 1NIB Homo sapiens cDNA clone IMAGE:31663 3' similar to contains MER13
repetitive element;


Arabidopsis thaliana serine/threonine protein phosphatase type one (TOPP8) gene, complete cds


Zea mays starch branching enzyme 1 (sbe1) gene, complete cds


Arabidopsis thaliana DNA chromosome 4, contig fragment No. 57


Homo sapiens mRNA for KIAA1198 protein, partial cds


Arabidopsis thaliana DNA chromosome 4, contig fragment No. 15


Arabidopsis thaliana DNA chromosome 4, contig fragment No. 15


Homo sapiens calcium channel alpha1E subunit (CACNA1E) gene, exons 7-49, and partial cds, alternatively
spliced


Homo sapiens partial 5-HT4 receptor gene, exons 2 to 5


Drosophila melanogaster clathrin light chain mRNA, complete cds


Arabidopsis thaliana receptor-like kinase LECRK1 (LECRK1) gene, complete cds


Mus musculus p116Rip mRNA, complete cds


Mus musculus Cctg gene for chaperonin containing TCP-1 gamma subunit, partial cds


Homo sapiens calcium channel, voltage-dependent, beta 2 subunit (CACNB2) mRNA, and translated
products


Oryzias latipes gene for membrane guanylyl cyclase OIGC1, complete cds


wd71f02.x1 NCI_CGAP_Lu24 Homo sapiens cDNA clone IMAGE:2337051 3'


Dictyostelium discoideum plasmid Ddp5, complete genome


Yersinia pestis plasmid pCD1


Mus musculus guanylate nucleotide binding protein 1 (Gbp1), mRNA


Mus musculus guanylate nucleotide binding protein 1 (Gbp1), mRNA


Homo sapiens latent transforming growth factor beta binding protein 4 (LTBP4) mRNA


qg22d10.x5 NCI_CGAP_Kid3 Homo sapiens cDNA clone IMAGE:1761811 3' similar to TR:O75936 O75936
GAMMA BUTYROBETAINE HYDROXYLASE ;


Mus musculus Scya6, Scya9, Scya16-ps, Scya5 genes for small inducible cytokine A6 precursor, small
inducible cytokine A9 precursor, Scya16 pseudogene, small inducible cytokine A5 precursor, complete cds


QV3-DT0018-081299-036-g04 DT0018 Homo sapiens cDNA


Ionopsidium acaule LEAFY protein (LEAFY2) gene, partial cds
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Top Hit Descriptor 


yj45e01.s1 Soares placenta Nb2HP Homo sapiens cDNA clone IMAGE:151704 3' si
repetitive element;


yj45e01.s1 Soares placenta Nb2HP Homo sapiens cDNA clone IMAGE:151704 3' si
repetitive element;


Homo sapiens Xq pseudoautosomal region; segment 1/2


Bovine NB25 mRNA for MHC class II (BoLA-DQB), complete cds


Arabidopsis thaliana DNA chromosome 4, contig fragment No. 56


Mus musculus Scya6, Scya9, Scya16-ps, Scya5 genes for small inducible cytokine /
inducible cytokine A9 precursor, Scya16 pseudogene, small inducible cytokine A5 pn


S.tuberosum mRNA for alcohol dehydrogenase


MR3-ST0203-151299-112-g06 ST0203 Homo sapiens cDNA


Mesocricetus auratus Na-taurocholate cotransporting polypeptide mRNA, partial cds


ti57e04.x1 NCI_CGAP_Lym12 Homo sapiens cDNA clone IMAGE:2134590 3'


Homo sapiens hisH1 gene, 5' UTR


Arabidopsis thaliana DNA chromosome 4, contig fragment No. 90
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Citrullus lanatus mRNA for wsus, complete cds 


Citrullus lanatus mRNA for wsus, complete cds 


Human cellular DNA/Human papillomavirus proviral DNA


Bacteriophage Ike, complete genome 


M.musculus mRNA for P19-protein tyrosine phosphatase 


A.thaliana mRNA for ribonucleotide reductase R2 


Bacteriophage r1t integrase, repressor protein (rro), dUTPase, holin and lysin genes, 


Citrullus lanatus mRNA for wsus, complete cds 


Citrullus lanatus mRNA for wsus, complete cds 


Dictyostelium discoideum unknown (DG1041) gene, complete cds


B.taurus mRNA for potassium channel 


Rattus norvegicus Thromboxane receptor (Tbxa2r), mRNA 


602019928F1 NCI_CGAP_Brn67 Homo sapiens cDNA clone IMAGE:4155318 5'
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Top Hit Descriptor 
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E.dispar mRNA for hexokinase (hxkl) 


601274604F1 NIH_MGC_20 Homo sapiens cDNA clone IMAGE:3615768 5'


P.dumerilii histone gene cluster for core histones H2A, H2B, H3 and H4


NEUROFILAMENT TRIPLET L PROTEIN (NEUROFILAMENT LIGHT POLYPEPTIDE) (NF


Lymantria dispar nucleopolyhedrovirus, complete genome


Lymantria dispar nucleopolyhedrovirus, complete genome


Homo sapiens BNIP3H (BNIP3H) gene, complete cds; nuclear gene for mitochondrial product


Vibrio cholerae hypoxanthine phosphoribosyltransferase (hpt) gene, partial cds, hemagglutinin/
regulatory protein (hapR) gene, complete cds, and YRAL VIBCO gene, partial cds


Vibrio cholerae hypoxanthine phosphoribosyltransferase (hpt) gene, partial cds, hemagglutinin/
regulatory protein (hapR) gene, complete cds, and YRAL VIBCO gene, partial cds


EST41651 Endometrial tumor Homo sapiens cDNA 5' end


Naja naja atra ctx-1 gene, exons 1-3


Naja naja atra ctx-1 gene, exons 1-3


Taxus canadensis geranylgeranyl diphosphate synthase mRNA, complete cds


Anabaena sp. ORF4 (partial), ORF3, ORF2, ORF1, adpA gene, adpB gene, adpC gene, adpD
gene and adpF gene


Homo sapiens hapl gene, complete CDS


Homo sapiens derivative 11 breakpoint fragment: partial intron 10 of the ALL-1/MLL/HRX gene
5 of the AF-4/FEL gene


Schistocerca gregaria alpha repetitive DNA


qh57e09.x1 Soares_fetal_liver_spleen_1NFLS_S1 Homo sapiens cDNA clone IMAGE:18488C
contains OFR.bl OFR repetitive element ;


Zea mays calcium-dependent protein kinase (MZECDPK2) mRNA, complete cds


Zea mays starch branching enzyme IIb (ae) gene, complete cds


ne13a02.s1 NCI_CGAP_Co3 Homo sapiens cDNA clone IMAGE:881066 3' similar to gb:M17
ACIDIC RIBOSOMAL PROTEIN P1 (HUMAN);


ne13a02.s1 NCI_CGAP_Co3 Homo sapiens cDNA clone IMAGE:881063 3' similar to gb:M17
ACIDIC RIBOSOMAL PROTEIN P1 (HUMAN);


ys02g06.s1 Soares fetal liver spleen 1NFLS Homo sapiens cDNA clone IMAGE:213658 3'


Mesocricetus auratus oviductin precursor (OVI) gene, complete cds
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Top Hit Descriptor 


Fuchsia hybrid cultivar Qiu 94208 ribosomal protein S10 gene, partial cds; nucleai
product


Rattus norvegicus chondroitin sulfate proteoglycan 5 (neuroglycan C) (Cspg5), ml


IL3-HT0519-040700-197-E05 HT0619 Homo sapiens cDNA


IL3-HT0519-040700-197-E05 HT0619 Homo sapiens cDNA


AV711696 DCA Homo sapiens cDNA clone DCAADH06 5'


l 

T 

1 
1 

1 

1 

I 


Cyprinus carpio mRNA for EGGS22 myosin heavy chain, 3'UTR 


Homo sapiens partial SLC22A2 gene for organic cation transporter (OCT2), exon 


Rattus norvegicus insulin-responsive glucose transporter (GLUT4) gene, 5' end 


xn39d11.x1 NCI_CGAP_Kid11 Homo sapiens cDNA clone IMAGE:2696085 3'


Human gene for dihydrolipoamide succinyltransferase, complete cds (exon 1-15)


Human gene for dihydrolipoamide succinyltransferase, complete cds (exon 1-15)


Mus musculus MAP kinase kinase kinase 1 (Mekk1) mRNA, complete cds


UI-H-BI3-akb-b-09-0-UI.s1 NCI_CGAP_Sub5 Homo sapiens cDNA clone IMAGE


602083269F1 NIH_MGC_81 Homo sapiens cDNA clone IMAGE:4247537 5'


Bos taurus factor V variant 2 (factor V) mRNA, complete cds


oo68d05.s1 NCI_CGAP_GC4 Homo sapiens cDNA clone IMAGE:1571337 3' sin
RETINOL-BINDING PROTEIN I, CELLULAR (HUMAN);


L.stagnalis mRNA for G protein-coupled receptor


L.stagnalis mRNA for G protein-coupled receptor


hh29f02.x1 NCI_CGAP_Lu24 Homo sapiens cDNA clone IMAGE:2956539 3' sim
MER16 repetitive element ;


Mus musculus ICR/Swiss glyceraldehyde 3-phosphate dehydrogenase (Gapd-S)


Homo sapiens pyruvate dehydrogenase kinase, isoenzyme 1 (PDK1), nuclear gen
protein, mRNA


hj10f06.x1 Soares_NFL_T_GBC_S1 Homo sapiens cDNA clone IMAGE:298141 '


Populus trichocarpa cv. Trichobel ABI3 gene


Populus trichocarpa cv. Trichobel ABI3 gene


RC2-HT0149-191099-012-C09 HT0149 Homo sapiens cDNA


B.napus mitochondrion DNA for ORF158


Homo sapiens chromosome 21 segment HS21C084


602067192F1 NIH_MGC_57 Homo sapiens cDNA clone IMAGE:4066223 5'


Top Hit 
Database 
Source 


z 


z 


EST_HUMAN


EST_HUMAN


EST_HUMAN


z 




H 




EST_HUMAN


H 




Z 


I 

I 


EST_HUMAN


z 


EST_HUMAN


z 


z 


EST_HUMAN


t- 
z 


z 


EST_HUMAN


z 


z 


EST_HUMAN




z 


EST_HUMAN


Top Hit Accession
No. 


AF287344.1


1 


BE710087.1
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Top Hit Descriptor 


602083269F1 NIH_MGC_81 Homo sapiens cDNA clone IMAGE:4247537 5'


CM0-HT0565-280200-245-b10 HT0565 Homo sapiens cDNA


CM0-HT0565-280200-245-b10 HT0565 Homo sapiens cDNA


Arabidopsis thaliana DNA chromosome 4, contig fragment No. 60


Arabidopsis thaliana AP2 domain containing protein RAP2.12 mRNA, partial cds


o 
1 

O 

a. 

1 

o 

1 
o 
a: 
f 


SEX HORMONE-BINDING GLOBULIN PRECURSOR (SHBG) (SEX STEROID-BINDING PROTEIN)
(SBP) (TESTIS-SPECIFIC ANDROGEN-BINDING PROTEIN) (ABP)


IL3-CT0219-160200-064-F10 CT0219 Homo sapiens cDNA


Mus musculus transforming growth factor alpha (TGFa) mRNA, complete cds


Mus musculus transforming growth factor alpha (TGFa) mRNA, complete cds
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Mus musculus genomic fragment, 279 Kb, chromosome 7 | 


i 
1 

I 
< 

z 

.1 
t 

i 
I 
1 
z 

1 

1 


Homo sapiens RAD54 (S.cerevisiae)-like (RAD54L) mRNA


Influenza B virus (B/Nanchang/480/94) NB protein gene, complete cds; and neuraminidase gene, partial cds


Archaeoglobus fulgidus section 68 of 172 of the complete genome


Homo sapiens chromosome 5 open reading frame 3 (C5ORF3), mRNA


GLUTAMATE-CYSTEINE LIGASE REGULATORY SUBUNIT (GAMMA-GLUTAMYLCYSTEINE
SYNTHETASE) (GAMMA-ECS) (GCS LIGHT CHAIN)


z 

a 
0 

i 


HYPOTHETICAL 51.7 KD PROTEIN IN THRC-TALB INTERGENIC REGION (ORF8)


EST382376 MAGE resequences, MAGK Homo sapiens cDNA


Homo sapiens HARP (HARP) gene, exon 17 and complete cds


I 

\ 
I 

% 
I 

X 

5 
p 

z 

* 
1 
1 


UI-HF-BN0-akk-d-05-0-UI.r1 NIH_MGC_50 Homo sapiens cDNA clone IMAGE:3077409 5'


UI-HF-BN0-akk-d-05-0-UI.r1 NIH_MGC_50 Homo sapiens cDNA clone IMAGE:3077409 5'


oo85g12.s1 NCI_CGAP_Kid5 Homo sapiens cDNA clone IMAGE:1573030 3' similar to gb:M26062
INTERLEUKIN-2 RECEPTOR BETA CHAIN PRECURSOR (HUMAN);


C1680a Clontech human aorta polyA+ mRNA (#6572) Homo sapiens cDNA clone GEN-529H09 5'


Homo sapiens mRNA for ASK1, complete cds
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Top Hit Descriptor 
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s 
I 

< 

.1 
f 

o 
z 

X 

I 


yg97a03.r1 Soares infant brain 1NIB Homo sapiens cDNA clone IMAGE:41467 5'


yg97a03.r1 Soares infant brain 1NIB Homo sapiens cDNA clone IMAGE:41467 5'
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■1 
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1 

? 
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?! 

a 
1 
1 

i 
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.1 
1 

1 
8 

i 

1 


Thermotoga maritima section 22 of 136 of the complete genome


zj50b01.s1 Soares_fetal_liver_spleen_1NFLS_S1 Homo sapiens cDNA clone IMAGE:453673 3' similar to
gb:X01057_rna1 INTERLEUKIN-2 RECEPTOR ALPHA CHAIN PRECURSOR (HUMAN);contains Alu
repetitive element;


QV3-SN0022-100500-186-h09 SN0022 Homo sapiens cDNA


Homo sapiens chromosome 21 segment HS21C084


Lycopersicon esculentum genomic RAPD band 26


ye15c11.s1 Stratagene lung (#937210) Homo sapiens cDNA clone IMAGE:117812 3'


Candida tropicalis DNA for mitochondrial NADP-linked isocitrate dehydrogenase, complete cds


Candida tropicalis DNA for mitochondrial NADP-linked isocitrate dehydrogenase, complete cds


3 

s 

1 
1 

Q 

1 

1 
I 

1 

a 

1 


117147 HEMBA1 Homo sapiens cDNA clone HEMBA1000769 5'


17147 HEMBA1 Homo sapiens cDNA clone HEMBA1000769 5'


1d12.x1 Soares_NFL_T_GBC_S1 Homo sapiens cDNA clone IMAGE:2581751 3'


193523F1 NIH_MGC_7 Homo sapiens cDNA clone IMAGE:3537581 5'


1-UM0036-080300-103-d09 UM0036 Homo sapiens cDNA


UI-H-BI0-aat-c-09-0-UI.s1 NCI_CGAP_Sub1 Homo sapiens cDNA clone IMAGE:2710289 3'


AV659047 GLC Homo sapiens cDNA clone GLCFSH06 3'


EST178192 Colon carcinoma (HCC) cell line Homo sapiens cDNA 5' end


601895465F1 NIH_MGC_19 Homo sapiens cDNA clone IMAGE:4124824 5'


zd94a04.r1 Soares_fetal_heart_NbHH19W Homo sapiens cDNA clone IMAGE:357102 5' similar to contains
element KER repetitive element ;


Homo sapiens PHEX gene


Homo sapiens PHEX gene


Drosophila melanogaster signal transducing adaptor protein (STAM), serine threonine kinase ial (IAL), and
zinc finger protein (DNZ1) genes, complete cds


yj70c05.r1 Soares breast 2NbHBst Homo sapiens cDNA clone IMAGE:154088 5'


C.perfringens ORF for putative membrane transport protein
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Top Hit Descriptor 


Borrelia burgdorferi glyceraldehyde-3-phosphate dehydrogenase (GAPDH), phosphoglycerate kinase (PGK),
triosephosphate isomerase (TPI) genes, complete cds


M.musculus p16K gene for 16 kDa protein


Ephydatia fluviatilis mRNA for aldolase, partial cds


1 

1 

1 


Rattus norvegicus desmin (Des), mRNA | 


60131 5638F1 NIH„MGC_8 Homo sapiens cDNA clone 1MAGE:3634329 5' ! 


Fugu rubripes putative neurotransmitter receptors, YDR140W homolog, and glycinamide ribonucleotide S 
transformylase (GART) genes, complete cds j 


Synechocysfis sp. PCC6803 complete genome, 23/27, 2868767-3002965 | 


TYROSINE-PROTEIN KINASE TRANSFORMING PROTEIN ABL j 


601465575F1 NIH_MGC_67 Homo sapiens cDNA clone IMAGE:3868795 5' j 


Homo sapiens ephrin-B3 (EFNB3), mRNA j 


Mus musculus mRNA for prolidase, complete cds j 


MRO-HT0208-221299-204-C08 HT0208 Homo sapiens cDNA j 


Homo sapiens G protein-coupled receptor 50 (GPR50) mRNA | 


Homo sapiens G protein-coupled receptor 50 (GPR50) mRNA j 


Homo sapiens gene for NBS1 , complete cds j 


Human calicivirus HU/NLV/Girlington/93/UK RNA for capsid protein (ORF2), strain HU/NLV/Girlington/93/UK| 


Human calicivirus HU/NLV/Girlington/93/UK RNA for capsid protein (ORF2), strain HU/NLV/Girlington/93/UK


P.dumerilii histone gene cluster for core histones H2A, H2B, H3 and H4 I 


Botrytis cinerea strain T4 cDNA library under conditions of nitrogen deprivation j 


Botrytis cinerea strain T4 cDNA library under conditions of nitrogen deprivation 1 


AV712467 DCA Homo sapiens cDNA clone DCAAFF05 5'


Homo sapiens adapter protein CMS mRNA, complete cds i 


Botrytis cinerea strain T4 cDNA library under conditions of nitrogen deprivation j 


Rhodopseudomonas acidophila pucB5, pucA5, pucB6, pucA6, pucB7, pucA7, pucB8, pucA8 and pucC
genes and ORF151


RC4-ST0173-191099-032-d12 ST0173 Homo sapiens cDNA j 


|Archaeoglobus fulgidus section 91 of 172 of the complete genome [' 


Carassius auratus keratin type I mRNA, complete cds | 
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Top Hit Descriptor 


Bovine branched chain alpha-keto acid dihydrolipoyl transacylase mRNA, complete cds


Pyrococcus horikoshii OT3 genomic DNA, 1-287000 nt. position (1/7) | 


Pyrococcus horikoshii OT3 genomic DNA, 1-287000 nt. position (1/7) | 


Homo sapiens DD4 gene for dihydrodiol dehydrogenase 4 [AKR1C4], exon 2


Pyrococcus horikoshii OT3 genomic DNA, 1-287000 nt. position (1/7) | 


Pyrococcus horikoshii OT3 genomic DNA, 1-287000 nt. position (1/7) J 


Rattus norvegicus Fibrinogen, gamma polypeptide (Fgg), mRNA j 


Arabidopsis thaliana DNA chromosome 4, contig fragment No. 77 j 


Bacteriophage SPBc2 complete genome j 


QV3-DT0018-081299-036-a03 DT0018 Homo sapiens cDNA | 


Schistosoma mansoni fructose bisphosphate aldolase mRNA, complete cds | 


xv23f10.x1 Soares_NFL_T_GBC_S1 Homo sapiens cDNA clone IMAGE:2813995 3'


AV752279 NPD Homo sapiens cDNA clone NPDAZE02 5' j 


AV752279 NPD Homo sapiens cDNA clone NPDAZE02 5' | 


Homo sapiens chromosome 21 segment HS21 COBO j 


601126096F1 NIH_MGC_9 Homo sapiens cDNA clone IMAGE:2990063 5'


502154306F1 NIH_MGC_83 Homo sapiens cDNA clone IMAGE:4295544 5' | 


601510347F1 NIH_MGC_71 Homo sapiens cDNA clone IMAGE:3911987 5' | 


AU136619 PLACE1 Homo sapiens cDNA clone PLACE1004693 5'


Homo sapiens chromosome 21 segment HS21C084


Homo sapiens chromosome 21 segment HS21C084


602154401F1 NIH_MGC_83 Homo sapiens cDNA clone IMAGE:4295305 5'


602154401F1 NIH_MGC_83 Homo sapiens cDNA clone IMAGE:4295305 5'


QV0-UM0093-100400-189-a06 UM0093 Homo sapiens cDNA


Hepatitis C virus 68_CL10 genome polyprotein gene, partial cds | 


Schizosaccharomyces pombe gene for Alp41 , complete cds j 


C.jacchus intron 4 of visual pigment gene (red allele) | 


yr33d02.r1 Soares fetal liver spleen 1NFLS Homo sapiens cDNA clone IMAGE:207075 5'


Homo sapiens PRO0611 protein (PRO0611), mRNA


602187015T1 NIH_MGC_49 Homo sapiens cDNA clone IMAGE:4299074 3'


S.cerevisiae chromosome IV reading frame ORF YDL054c | 


Homo sapiens core histone macroH2A2.2 (MACROH2A2), mRNA j 


602187015T1 NIH_MGC_49 Homo sapiens cDNA clone IMAGE:4299074 3' j 


Oryctolagus cuniculus H+,K+-ATPase alpha 2c subunit mRNA, complete cds | 
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IAB031326.1 




| 


i 


IBF690522.1 | 


Z74102.1 | 




|BF690522.1 


'AF023129.1 | 


Most Similar 
(Top) Hit 
BLAST E 
Value 


3E-01 
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3 




3E-01 


3 


.3E-01 


3E-01 


9 


3E-01 
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9 
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2.51 1 




13.211 






1.34! 


1.28[ 


4.54| 


4.14| 


1.27| 


4.45| 


ORFSEQ 
ID NO: 


23122! 


23362i 




23368| 


8 
s 




234311 










237701 


23870I 






24110| 


1 






2471 8| 


24719| 


1 


I 


25013| 




1 


259831 




26723 | 


26744! 






27031 | 


27467| 


SEQID 
NO: 


3321| 


3575' 




3581 


3575 




I 


s 








1 


4092 


1 




4328 j 




5075 


4783 


1 


1 


1 
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1 
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SEQID 

NO: 


3404 


1 


1 


3667 




3714j 


37341 


1 


4046 1 


I 


4075 


8 
? 


41921 




42181 


4433 1 




4771 




5074 1 


s 


5173[ 


51731 


1 




1 


1 


§ 


1 


1 


1 




1 
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E 













to 






ie 1MAGE:2520977 3' similar to j 


In 


39 3' similar to gb:U05760_rna1 | 












:1460584 3' similar to TR:Q1 6671 I 
SOR. ; I 


JELL TRANSCRIPTION FACTOR | 






I 
1 


io 






58 3' similar to TR:Q14048 Q 14048 
ontains element PTR5 repetitive 




3 

is 
1 

1 




Top Hit Descriptor 


|MR4-BT0358-130700-010-h08 BT0358 Homo sapiens cDNA 


| Homo sapiens dopamine transporter (SLC6A3) gene, complete cds 


jMus musculus cofilin 2, muscle (Cfl2), mRNA 


|601158052F1 NIH_MGC_21 Homo sapiens cDNA clone IMAGE:350480 


j 601 462741 F1 NIH_MGC_67 Homo sapiens cDNA clone IMAGE:386600: 


Gallus gallus scyc1 gene for lymphotactin, exons 1-3


Ephydatia fluviatilis mRNA for sALK-6, complete cds 


Iwu24d09.x1 Soares_Dieckgraefe_colon_NHCD Homo sapiens cDNA cloi 
|TR:O60287 060287 KIAA0539 PROTEIN. ; 


I602078440F1 NIH_MGC_62 Homo sapiens cDNA clone lMAGE:425304i 


tf39b02.x1 NCI_CGAP_Brn23 Homo sapiens cDNA clone IMAGE:20985 
ANNEXIN V (HUMAN); 


j Dictyostelium discoideum ORF DG1016 gene, partial cds 


jHomo sapiens colon cancer antigen NY-CO-45 mRNA, partial cds 


|AU149146 NT2RM4 Homo sapiens cDNA clone NT2RM4001691 3' 


JAU149146 NT2RM4 Homo sapiens cDNA clone NT2RM4001691 3' 


iAV735249 cdA Homo sapiens cDNA clone cdAAJB11 5' 


Ial48e09.s1 Soares_NFL_T_GBC_S1 Homo sapiens cDNA clone IMAGE 
|Q16671 ANTI-MULLERIAN HORMONE TYPE II RECEPTOR PRECUR 


j NUCLEAR FACTOR OF ACTIVATED T-CELLS, CYTOPLASMIC 4 (T C 
!NFAT3) (NF-ATC4) (NF-AT3) 


;qt69f09.x1 NCI_CGAP_Eso2 Homo sapiens cDNA clone IMAGE:196055 


! H.sapiens DNA for endogenous retroviral like element 


; UI-H-BI3-aki-e-10-0-Ul.s1 NCI_CGAP_Sub5 Homo sapiens cDNA clone 


! 601 821 567F1 NIH_MGC_62 Homo sapiens cDNA clone IMAGE:404622 


Homo sapiens chromosome 21 segment HS21 C01 3 


QV3-BN0045-220300-129-f10 BN0046 Homo sapiens cDNA


ts18g07.x1 NCI_CGAP_Pan1 Homo sapiens cDNA clone IMAGE:22289f 
COLLAGEN VI ALPHA-2 ALTERNATIVE C-TERMINAL DOMAIN. [1] ;c 
element ; 


i Human E1A enhancer binding protein (E1A-F) mRNA, partial cds 


as80c09.x1 Barstead colon HPLRB7 Homo sapiens oDNA clone IMAGE:, 
|60S RIBOSOMAL PROTEIN L30 (HUMAN); 


Human creatine kinase-B mRNA, complete cds 
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Top Hit Acession 
No. 




|AF119117.1 | 




I 


1 


1 

1 


|AB026829.1 i 


|aW001 114.1 


JBF571 764.1 j 




1 


s 

< 


i 


iAU149146.1 i 


jAV735249.1 | 


AA897474.1 | 


5 


jAI285402.1 ! 


1 


jAW449368.1 | 


!BF248490.1 | 


IAL163213.2 | 


i 

I 


AI623388.1 


IU18018.1 


|aI720470.1 


i 


Most Similar 
(Top) Hit 
BLAST E 
Value 


| 1.3E-01 1 


| 1.3E-01| 








| 1.3E-01| 


! 1.3E-01| 




! 1.3E-01| 
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SEQID 
NO: 


| 17921 1 


| 18318| 




I 




1 


1 


1 




10362| 


1 




8 




11267| 


1 






11631| 


S 


12022| 


12124| 


1 


1 




1 


12831 | 


Probe 
SEQID 
NO: 


1 


I 




| 8873 j 


1 




| 9757 [ 


| 9784| 


I 








1355 


| 1355j 


| 1361 | 


1 


| 1613I 


| 1631 [ 




| 1877| 


| 21 34 1 


| 2240 1 


1 


8 


I 2812| 


1 2872I 


8 
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Top Hit Descriptor 


Wheat mRNA for a group 3 late embryogenesis abundant protein (LEA) | 


QV1-BT0259-261099-021-d05 BT0259 Homo sapiens cDNA


Methanococcus jannaschii section 142 of 150 of the complete genome j 


Bacillus subtilis complete genome (section 15 of 21): from 2795131 to 3013540


Wheat mRNA for a group 3 late embryogenesis abundant protein (LEA)


Wheat mRNA for a group 3 late embryogenesis abundant protein (LEA) | 


Bacillus subtilis complete genome (section 15 of 21): from 2795131 to 3013540


601810786R1 NIH_MGC_46 Homo sapiens cDNA clone IMAGE:4053668 3'


P.clarkii mRNA; repeat region (ID 2MRT7) [ 


P.clarkii mRNA; repeat region (ID 2MRT7) | 


HEMOLYSIN PRECURSOR j 


HYPOTHETICAL 52.4 KD PROTEIN C12B10.08C IN CHROMOSOME I


HYPOTHETICAL 52.4 KD PROTEIN C12B10.08C IN CHROMOSOME I


UI-HF-BK0-aah-d-01-0-UI.r1 NIH_MGC_36 Homo sapiens cDNA clone IMAGE:3053617 5'


zc08d02.r1 Soares_parathyroid_tumor_NbHPA Homo sapiens cDNA clone IMAGE:321699 5' j 


Homo sapiens gene encoding plakophilin (exons 1-13) ) 


601493518F1 NIH_MGC_70 Homo sapiens cDNA clone IMAGE:3895613 5' | 


IL0-CT0031-221099-113-e04 CT0031 Homo sapiens cDNA


Mouse galactosyltransferase mRNA, complete cds j 


PM3-BN0137-290300-002-f09 BN0137 Homo sapiens cDNA f 


wc99g03.x1 NCI_CGAP_Co3 Homo sapiens cDNA clone IMAGE:2326804 3' similar to SW:GST2_HUMAN
Q99735 MICROSOMAL GLUTATHIONE S-TRANSFERASE II ;


xc49d07.x1 NCI_CGAP_Eso2 Homo sapiens cDNA clone IMAGE:2587597 3' similar to gb:M13452 LAMIN A
(HUMAN);


Staphylococcus aureus plasmid pSK23 putative recombinase Sin (sin) gene, partial cds; and transcriptional 
regulator QacR (qacR) and multidrug efflux protein QacB (qacB) genes, complete cds 


Haemophilus influenzae Rd section 29 of 163 of the complete genome


1 
1 


AV710857 Cu Homo sapiens cDNA clone CuAAKE08 5'


Yeast MPT5 gene for suppressor protein, complete cds ( 


601655578R1 NIH_MGC_65 Homo sapiens cDNA clone IMAGE:3846283 3' | 


601900763F1 NIH_MGC_19 Homo sapiens cDNA clone IMAGE:4130103 5'


Homo sapiens dynein intermediate chain DNAI1 (DNAI1) gene, exon 17


Top Hit 
Database 
Source 


z 


EST_HUMAN 


z 


z 


E 


z 


z 


EST HUMAN 


z 


z 


SWISSPROT 


SWISSPROT 


SWISSPROT 


EST HUMAN 


EST HUMAN 




EST HUMAN j 


EST HUMAN 




ESTHUMAN I 


EST HUMAN | 


| EST HUMAN | 


z 


z 




EST HUMAN | 


z 


EST HUMAN j 


ESTJHUMAN | 


z 


Top Hit Accession
No. 




1 


U67600.1 | 


Z99118.1 






I 


LL 


Z54255.1 






Q10441 | 


| 

5 


AW401836.1 | 


W33035.1


1 


1 


AW845275.1 | 


M26925.1 [ 


:: ; 


AI913753.1 | 


AW083652.1 | 


8 


U32714.1 I 


X77961.1 ! 


AV710857.1


§ 
1 


BE962324.2 | 


BF314481.1 ! 


AF190493.1 


Most Similar 
(Top) Hit 
BLAST E 
Value 
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I 
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Si 
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V 
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Si 
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ORF SEQ 
ID NO: 


1 










23199| 






| 


237671 




24807I 


1 




24947 | 




25776 1 


25824 | 






2671 5) 








27546| 










2871 5| 


SEQ ID 

NO: 


1 




13143| 


1 




1 


13350 1 




I 






15040J 


1 


1 


15174[ 














i 16772| 




1 


17340 1 
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1 




18447| 


Probe 
SEQ ID 
NO: 






| 3219 


| 3433, 


3477[ 


34771 


3554 1 


3704 1 






5012; 


51741 






5251 1 


1 


1 


I 


1 
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I 6642I 
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! 7521! 


7747 j 
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8579| 
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Top Hit Descriptor 


| Rabbit glycogen-associated protein phosphatase regulatory subunit (RG1) mRNA, 


|AV658033 GLC Homo sapiens cDNA clone GLCFIB123' 


jHomo sapiens Xq pseudoautosomal region; segment 2/2 


MACROPHAGE-STIMULATING PROTEIN RECEPTOR PRECURSOR (MSP Rl 
(CDW 1 36) (CD136 ANTIGEN) 


|Homo sapiens colon cancer antigen NY-CO-45 mRNA, partial cds 


jR.norvegicus NF68 gene for 68kDa neurofilament 


jQV4-BT0234-111199-031-g10 BT0234 Homo sapiens cDNA 


| qn20g05.x1 NCI_CGAP_Lu5 Homo sapiens cDNA clone IMAGE:1898840 3' 


jXenopus laevis integrin alpha 3 subunit mRNA, partial cds 


j CYCLIN T 




tn18d08.x1 NCI_CGAP_Brn25 Homo sapiens cDNA clone IMAGE:2167983 3'


Inm08g11.s1 NCI CGAP Co10 Homo sapiens cDNA clone IMAGE:1 059520 3' sir 
HEME OXYGENASE 1 (HUMAN); 


\ 

1 

< 
z 

■3 
1 
t 
| 

1 
§ 


j Arabidopsis thaliana DNA chromosome 4, contig fragment No. 60 


EST384142 MAGE resequences, MAGL Homo sapiens cDNA


jSynechocystis sp. PCC6803 complete genome, 23/27, 2868767-3002965 


AU140363 PLACE2 Homo sapiens cDNA clone PLACE2000403 5'


Mus musculus pre T-cell antigen receptor alpha (Ptcra), mRNA


Rattus norvegicus Procollagen II alpha 1 (Col2a1), mRNA 


1 

1 
f 


|interleukin-12 p35 subunit [mice, Genomic, 700 nt, segment 4 of 5] 


HSC1RF022 normalized infant brain cDNA Homo sapiens cDNA clone c-1rf02 3'


jMus musculus calcium channel, voltage-dependent, Ttype, alpha 1G subunit(Cao 


a 
I 

1 
< 

z 

.1 
1 
I 

?, 

i 
z 

8 


1 C.reinhardtii nuclear gene on linkage group XIX 


|A.immersus gene for transposase 


jG.gallus gene encoding non-histone chromosomal protein HMG-14b, exons 4 and 


MR3-ST0290-290100-025-g07 ST0290 Homo sapiens cDNA


|MR3-ST0290-290100-025-g07 ST0290 Homo sapiens cDNA 
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Top Hit Descriptor 


Mus musculus major histocompatibility locus class III region: butyrophilin-like protein gene, partial cds;
Notch4, PBX2, RAGE, lysophatidic acid acyl transferase-alpha, palmitoyl-protein thioesterase 2 (PPT2),
CREB-RP, and tenascin X (TNX) genes, comple>


Drosophila melanogaster klarsicht protein (klar) mRNA, complete cds | 


IL5-UM0070-020500-068-a08 UM0070 Homo sapiens cDNA i 


Fugu rubripes neurofibromatosis type 1 (NF1), A-kinase anchor protein (AKAP84), BAW protein (BAW), and I 
WSB1 protein (WSB1) genes, complete cds j 


Tapa-1 integral membrane protein TAPA-1 [mice, B cell lymphoma line 38C13, Genomic, 1973 nt, segment 1
of 7]


1 
I 

1 


Synechocystis sp. PCC6803 complete genome, 10/27, 1188886-1311234 j 


nx76a03.s1 NCI_CGAP_Ew1 Homo sapiens cDNA clone IMAGE:1268140 similar to contains Alu repetitive
element;contains element MER35 repetitive element ;


S.pombe ste; gene encoding protein kinase


Providencia rettgeri penicillin G acylase gene


Homo sapiens LGMD2B gene I 


PM3-FT0024-130600-004-f12 FT0024 Homo sapiens cDNA j 


RC3-CT0254-280999-011-a01 CT0254 Homo sapiens cDNA | 


AF035745 Human salivary gland cell line HSG Homo sapiens cDNA clone RL43 i 


ACETYL-COENZYME A SYNTHETASE (ACETATE-COA LIGASE) (ACYL-ACTIVATING ENZYME) j 


Homo sapiens syntaxin 4 binding protein UNC-18c (UNC-18c) mRNA, complete cds


Homo sapiens phosphatidylinositol glycan, class B (PIGB), mRNA


602140976F1 NIH_MGC_46 Homo sapiens cDNA clone IMAGE:4302019 5'


602140976F1 NIH_MGC_46 Homo sapiens cDNA clone IMAGE:4302019 5' | 


TRAB PROTEIN j 


ah31b06.s1 Soares_parathyroid_tumor_NbHPA Homo sapiens cDNA clone 1240403 3' similar to gb:J03483
CHROMOGRANIN A PRECURSOR (HUMAN);


nh04g10.s1 NCI_CGAP_Thy1 Homo sapiens cDNA clone IMAGE:943362


nh04g10.s1 NCI_CGAP_Thy1 Homo sapiens cDNA clone IMAGE:943362


H.sapiens IL15 gene


PM1-ST0270-080200-001-f09 ST0270 Homo sapiens cDNA


DKFZp547P194_r1 547 (synonym: hfbr1) Homo sapiens cDNA clone DKFZp547P194 5'


Pediococcus acidilactici H plasmid pSMB74 pediocin AcH production (pap) gene cluster papA, papB, papC
and papD genes, complete cds
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Top Hit Descriptor 


zp93b12.r1 Stratagene muscle 937209 Homo sapiens cDNA clone IMAGE:627743 5'


zp93b12.r1 Stratagene muscle 937209 Homo sapiens cDNA clone IMAGE:627743 5'


|yd19h03.s1 Soares fetal liver spleen 1NFLS Homo sapiens cDNA clone IMAGE:108725 3' similar to j 
|gb:M81181 SODIUM/POTASSIUM-TRANSPORTING ATPASE BETA-2 (HUMAN); [ 


j MR2-GN0027-040900-005-a08 GN0027 Homo sapiens cDNA [ 


Jyi96a09.s1 Soares placenta Nb2HP Homo sapiens cDNA clone IMAGE:147064 3' | 


I HSC1 RF022 normalized infant brain cDNA Homo sapiens cDNA clone c-1rf02 3' | 


] Carassius auratus activin beta A precursor, mRNA, complete cds I 


yh35f12.r1 Soares placenta Nb2HP Homo sapiens cDNA clone 1MAGE:131759 5' similar to contains Alu I 
J repetitive element;contains TAR1 repetitive element ; | 


] M.musculus cytokine gene | 


jZ.mobilis tgt and lig genes encoding tRNA guanine transglycosylase and DNA ligase I 


jZ.mobilis tgt and lig genes encoding tRNA guanine transglycosylase and DNA ligase j 


jSKIN SECRETORY PROTEIN XP2 PRECURSOR (APEG PROTEIN) I 


RC2-NT0112-120600-014-f03 NT0112 Homo sapiens cDNA


601680551R2 NIH_MGC_83 Homo sapiens cDNA clone IMAGE:3950604 3'


j601906350F1 NIH_MGC_54 Homo sapiens cDNA clone IMAGE:4134085 5' j 


j DEOXYRIBONUCLEASE II PRECURSOR (DNASE II) (ACID DNASE) (LYSOSOMAL DNASE II) | 


ws08d01.x1 NCI_CGAP_Kid11 Homo sapiens cDNA clone IMAGE:2496577 3' similar to contains MER7.t3
MER7 repetitive element ;


j Arabidopsis thaliana DNA chromosome 4, contig fragment No. 16 j 


UI-H-BI3-alc-d-07-0-UI.s1 NCI_CGAP_Sub5 Homo sapiens cDNA clone IMAGE:2736420 3'


601456301F1 NIH_MGC_66 Homo sapiens cDNA clone IMAGE:3859849 5'


601906489F1 NIH_MGC_54 Homo sapiens cDNA clone IMAGE:4134071 5'


QV2-NT0048-160800-316-e05 NT0048 Homo sapiens cDNA


Chlamydophila pneumoniae AR39, section 91 of 94 of the complete genome


jan32c04.y5 Gessler Wilms tumor Homo sapiens cDNA clone IMAGE:1700358 5' | 


j Drosophila melanogaster tyrosine kinase p45 isoform (fer) mRNA, complete cds j 


EST364414 MAGE resequences, MAGB Homo sapiens cDNA


j Mouse FTZ-F1 gene | 


|UI-H-BW1-aoa-e-12-0-Ul.s1 NCI_CGAP_Sub7 Homo sapiens cDNA clone IMAGE:3084023 3' | 


zh62h04.s1 Soares_fetal_liver_spleen_1NFLS_S1 Homo sapiens cDNA clone IMAGE:416695 3'


jHomo sapiens growth factor receptor-bound protein 7 (GRB7) gene, complete cds | 


Top Hit 
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|EST HUMAN 


1 


EST_HUMAN


EST_HUMAN


EST_HUMAN


EST_HUMAN


NT


EST_HUMAN


NT


NT


NT


SWISSPROT


EST_HUMAN


EST_HUMAN


EST_HUMAN


SWISSPROT


EST_HUMAN


NT


EST_HUMAN


EST_HUMAN


EST_HUMAN


EST_HUMAN


NT


EST_HUMAN


NT


EST_HUMAN
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EST HUMAN I 


EST HUMAN 


! IN, 


Top Hit Accession
No. 


(AA192153.1 i 


1 


|t72675.1 


1 


1 
1 


|F03265.1 j 


|AF1 69032.1 i 


|r23708.1 I 


1 








|BE757023.1 ! 


|BE974556.1 j 


Sj 


l 

O 


S 


|AL161504.2 


1 


|BF033991.1 i 


I 


|BF365703.1 ! 


|AE002265.2 | 


|AI792349.1 i 


|U5D450.1 j 


|AW952344.1 | 


|D49S83.1 j 


| BF51 5935.1 | 


1 


|AF274875.1 | 


Most Similar 
(Top) Hit 
BLAST E 
Value 


.1E-01 


V 


.1E-01 


.1E-01 


.1E-01 


.1E-01 


.1E-01 
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•OE-01 


1O-30' 


.OE-01 


■OE-01 






























































Expression 
Signal 


I 2.24I 




| 2.48 










Sj 


s 










1 




q 


s 


Sj 








2.44| 






$ 




i 0.87I 


? 






ORF SEQ 

ID NO: 


| 27294| 
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j 2271 7[ 
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251 92 1 
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1 2925 i 
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Top Hit Descriptor 


| Trimeresurus flavoviridis DNA for phospholipase A2 inhibitor, complete cds j 


| Arabidopsis thaliana DNA chromosome 4, contig fragment No. 38 | 


|601453642F1 NIH_MGC_66 Homo sapiens cDNA clone IMAGE:3857243 5' J 


601453642F1 NIH_MGC_66 Homo sapiens cDNA clone IMAGE:3857243 5'


|601453642F1 NIH_MGC_66 Homo sapiens cDNA clone IMAGE:3857243 5' | 


601453642F1 NIH_MGC_66 Homo sapiens cDNA clone IMAGE:3857243 5'


602150882F1 NIH_MGC_81 Homo sapiens cDNA clone IMAGE:4291917 5'


| Cavia porcellus 3beta-hydroxysteroid sulfotransferase mRNA, complete cds j 


j Cavia porcellus 3beta-hydroxysteroid sulfotransferase mRNA, complete cds | 


| M.capricolum DNA for CONTIG MC073 | 


| Mus musculus coding region determinant-binding protein (Crdbp), mRNA j 


[Acinetobacter sp. cysD, cobQ, sodM, lysS, rubA, rubB, estB, oxyR, ppk, mtgA, ORF2 and ORF3 genes | 


Human BRCA1, Rho7 and vati genes, complete cds, and ipf35 gene, partial cds


jRattus norvegicus calcium channel alpha-1C subunit (ROB2) mRNA, partial cds 


1 Human pephBGT-1 betaine-GABA transporter mRNA, complete cds j 


Homo sapiens BAI1 -associated protein 3 (BAIAP3) mRNA \ 


| Homo sapiens nasopharyngeal epithelium specific protein 1 (NESG1 ), mRNA [ 


602133086F1 NIH_MGC_81 Homo sapiens cDNA clone IMAGE:4288269 5'


601285082F1 NIH_MGC_44 Homo sapiens cDNA clone IMAGE:3607653 5'


6012S6082F1 NIH_MGC_44 Homo sapiens cDNA clone IMAGE:3607653 5'


IAV732224 HTF Homo sapiens cDNA clone HTFAUA06 5' | 


|601655988R1 NIH_MGC_66 Homo sapiens cDNA clone IMAGE:3855981 3' j 


HYPOTHETICAL PROTEIN KIAA0032 


HYPOTHETICAL PROTEIN KIAA0032 i 


'UI-H-BI1-afx-h-05-0-Ul.s1 NCI_CGAP_Sub3 Homo sapiens cDNA clone IMAGE:2723553 3' \ 


Photobacterium damselae subsp. damselae partial gyrB gene for DNA gyrase B subunit 1 


hd28h12.x1 Soares_NFL_T_GBC_S1 Homo sapiens cDNA clone IMAGE:2910887 3'


Mus musculus major histocompatibility locus class II region; Fas-binding protein Daxx (DAXX) gene, partial
cds; Bing1 (BING1), tapasin (tapasin), RalGDS-like factor (RLF), KE2 (KE2), BING4 (BING4), beta1,3-
galactosyl transferase (beta1,3-galactosyl tr>


Molluscum contagiosum virus subtype 1 , complete genome | 


Molluscum contagiosum virus subtype 1 , complete genome | 


Molluscum contagiosum virus subtype 1 , complete genome | 


Top Hit 
Database 


z 


z 


|EST HUMAN 


i 


I 

B 


|EST HUMAN 


I 
1 




z 




z 


z 


z 


z 


z 


z 


z 


EST HUMAN I 


I EST HUMAN | 


? 
2 

X 

co 


EST HUMAN \ 


EST HUMAN j 


SWISSPROT | 


SWISSPROT [ 


EST_HUMAN | 




EST HUMAN | 




z 


z 


z 


Top Hit Accession
No. 


|AE003473.1 


AL161538.2


|BF035861.1 




I 
1 


1 
1 


|BF671063.1 


|U55944.1 


i 

1 


|Z33059.1 


I 675351 7| 






jU31815.1 ! 


IU27699.1 j 


§ 

s 


8 
§ 


|BF575511.1 | 
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1 


AF100956.1


U60315.1 | 


U60315.1 I 


U60315.1 | 
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Top Hit Descriptor 


J Plasmodium falciparum P-type ATPase 3 gene | 


j REGULATORY PROTEIN ZESTE j 


za68a12.r1 Soares_fetal_lung_NbHL19W Homo sapiens cDNA clone IMAGE:297694 5' similar to
PIR:S52171 S52171 small G protein - human ;


|Homo sapiens chromosome 16 open reading frame 5 (C16orf5), mRNA | 


602129030F2 NIH_MGC_56 Homo sapiens cDNA clone IMAGE:4285951 5'




PM0-HT0339-251199-003-d01 HT0339 Homo sapiens cDNA


Atrichum angustatum AtranFlo2 protein (AtranFlo2) gene, partial cds


1 

5 
1 
Z 

"3 
1 
| 

$ 
1 

1 
1 


UI-H-BI3-alo-f-08-0-Ul.s1 NCI_CGAP_Sub5 Homo sapiens cDNA clone IMAGE:3068294 3' j 


UI-H-BI3-alo-f-08-0-UI.s1 NCI_CGAP_Sub5 Homo sapiens cDNA clone IMAGE:3068294 3'


Homo sapiens similar to endoglycan (H. sapiens) (LOC63107), mRNA I 


1 FOLD BIFUNCTIONAL PROTEIN [INCLUDES: METHYLENETETRAHYDROFOLATE j 
! DEHYDROGENASE ; METHENYLTETRAHYDROFOLATE CYCLOHYDROLASE ] 


jH.sapiens flow-sorted chromosome 6 Hindlll fragment, SC6pA20F8 | 


EST180187 Liver, hepatocellular carcinoma Homo sapiens cDNA 5' end


|602129682F1 NIH_MGC_56 Homo sapiens cDNA clone IMAGE:4286180 5' j 


PROBABLE DNA LIGASE (POLYDEOXYRIBONUCLEOTIDE SYNTHASE [ATP]) ; 


| EST1 1 595 Uterus Homo sapiens cDNA 5' end | 


j TRANSCRIPTION INITIATION FACTOR TFIID 135 KDA SUBUNIT (TAFII-135) (TAFII135) (TAFII-130) I 

(TAFII130) j 


Homo sapiens paired box gene 6 (aniridia, keratitis) (PAX6), isoform b, mRNA | 


! zn99a05.s1 Stratagene colon (#937204) Homo sapiens cDNA clone IMAGE:566288 3' I 


I601191770F1 NIH_MGC_7 Homo sapiens cDNA clone IMAGE:3535648 5' | 


601191770F1 NIH_MGC_7 Homo sapiens cDNA clone IMAGE:3535648 5' | 


iDKFZp434D1313_r1 434 (synonym: htes3) Homo sapiens cDNA clone DKFZp434D1313 5' | 


S.cerevisiae chromosome XIV reading frame ORF YNL285W j 


Homo sapiens zinc finger protein 92 (ZFP92), expressed-Xq28STS protein (XQ28ORF), and biglycan (BGN)
genes, complete cds; and plasma membrane calcium ATPase isoform 3 (PMCA3) gene, partial cds


Homo sapiens zinc finger protein 92 (ZFP92), expressed-Xq28STS protein (XQ28ORF), and biglycan (BGN)
genes, complete cds; and plasma membrane calcium ATPase isoform 3 (PMCA3) gene, partial cds
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Top Hit Descriptor 


zh68a02.r1 Soares_fetal_liver_spleen_1NFLS_S1 Homo sapiens cDNA clone IMAGE:417194 5' similar to
contains element MER12 repetitive element ;


Mus musculus JNK interacting protein-3a (Jip3) mRNA, complete cds > 


Methanobacterium thermoautotrophicum from bases 1175181 to 1189406 (section 101 of 148) of the j 
| complete genome j 


zs55g08.s1 NCI_CGAP_GCB1 Homo sapiens cDNA clone IMAGE:701438 3' | 


zs55g08.s1 NCI_CGAP_GCB1 Homo sapiens cDNA clone IMAGE:701438 3'


Oryctolagus cuniculus cytochrome P-450 (CYP4A4) gene, 5' end | 


Gluconobacter oxydans tRNA-Ile and tRNA-Ala genes


Human DNA for immunoglobulin alpha heavy chain from a case of alpha heavy chain disease j 


Mus musculus nidogen 2 (Nid2), mRNA j 


Homo sapiens Xq pseudoautosomal region; segment 2/2 | 


Trichomonas vaginalis beta-tubulin (btubl) gene, complete cds j 


Dictyostelium discoideum adenylyl cyclase (acrA) gene, complete cds


Oryctolagus cuniculus galectin-3 gene, untranslated exon and 5' flanking region 1 


Homo sapiens LCN1b gene j 


Mouse germline IgM chain gene, D region; D-q52, mu switch region (part a) | 


Mouse germline IgM chain gene, D region; D-q52, mu switch region (part a) i 


Homo sapiens Snf2-related CBP activator protein (SRCAP) mRNA ( 


Homo sapiens Snf2-related CBP activator protein (SRCAP) mRNA


| Lacerta media cytochrome c oxidase subunit 1 gene, partial cds; mitochondrial gene for mitochondrial product | 


i Lacerta media cytochrome c oxidase subunit 1 gene, partial cds; mitochondrial gene for mitochondrial product | 


601893437F1 NIH_MGC_17 Homo sapiens cDNA clone IMAGE:4139216 5'


601893437F1 NIH_MGC_17 Homo sapiens cDNA clone IMAGE:4139216 5'


Archaeoglobus fulgidus section 34 of 172 of the complete genome || 


Bacillus stearothermophilus BsrFI methylase (FIM) and BsrFI restriction endonuclease (FIR) genes, complete 1 
cds | 


Helicobacter pylori 26695 section 130 of 134 of the complete genome j 


M PROTEIN, SEROTYPE 6 PRECURSOR | 


Mus musculus phospholipase C-like protein mRNA, partial cds | 


Mus musculus myosin XV (Myo15), mRNA
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! 

1 


\ 


j Mus musculus zinc transporter (ZnT-3) gene, complete cds | 


I601439573F1 NIH_MGC_72 Homo sapiens cDNA clone IMAGE:3924523 5' j 


! Bos taurus connective tissue growth factor precursor (CTGF) gene, complete cds | 


RC2-PT0004-031299-011-d05 PT0004 Homo sapiens cDNA


i Beet necrotic yellow vein virus RNA-2 j 


601115055F1 NIH_MGC_16 Homo sapiens cDNA clone IMAGE:3355598 5'


i Chlamydophila pneumoniae AR39, section 73 of 94 of the complete genome | 


Mus musculus epidermal growth factor receptor (Egfr) gene, exons 5 through 28, and complete cds,
alternatively spliced


! Pseudomonas putida malonate decarboxylase gene cluster (mdcA, mdcB, mdcC, mdcD, mdcE, mdcG, j 
[ mdcH , mdcL and mdcM genes), complete cds | 


iHomo sapiens extracellular glycoprotein lacritin precursor, gene, complete cds | 


Homo sapiens chromosome 21 segment HS21 C002 j 


EST366723 MAGE resequences, MAGC Homo sapiens cDNA ! 


Molluscum contagiosum virus subtype 1, complete genome I 


, complete cds (exon 1-15) | 


complete cds (exon 1-15) | 


scDNA | 


', 2137259-2267259 | 


', 2137259-2267259 I 


1 
O 

s 

I 


Dictyostelium discoideum cyclic nucleotide phosphodiesterase gene, complete cds


Thermoplasma acidophilum complete genome; segment 5/5


;EST378191 MAGE resequences, MAGI Homo sapiens cDNA | 


Homo sapiens cAMP responsive element binding protein-like 2 (CREBL2) mRNA i 


M.musculus gene for gelatinase B | 


Herpesvirus saimiri transformation-associated protein (STP), and dihydrofolate reductase (DHFR) genes,
complete cds, and small nuclear RNAs (uRNAs)


Homo sapiens ABCA1 (ABCA1 ) gene, complete cds j 


Homo sapiens ABCA1 (ABCA1 ) gene, complete cds | 


Botrytis cinerea strain T4 cDNA library under conditions of nitrogen deprivation


I 

< 

1 
1 

I 


H.sapiens AGT gene, intron 4


Human gene for dihydrolipoamide succinyltransferase 


2 
1 

1 
| 

| 

1 
1 
X 


PM3-BT0347-1 70200-001-b08 BT0347 Homo sapien 


| Synechocystis sp. PCC6803 complete genome, 17/2/ 


Synechocystis sp. PCC6803 complete genome, 17/27 


601855548F1 NIH MGC_57 Homo sapiens cDNA cli 
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' 8.0E-02| 


j 8.0E-02| 


i 8.OE-O2I 


! 8.0E-02| 


! 8.0E-02| 


| 8.0E-02| 


i 8.0E-02| 


8.0E-02| 


1 

i 


I 


| 2.82| 




i 3.14| 


! 3.13| 






j 4.13 1 








1 


CO 




10.54| 


2 








§ 






0.84| 


0 


1 


0.87] 




s 


3.65 1 






ORFSEQ 


ID NO: 


| 24655 


1 


1 


| 27164 


s 
is 




] 25336 




21247 










21440| 


21441 | 


8 


1 


22104| 




20833 1 




8 






1 




25581 | 






27472| 


SEQ ID 
NO: 




s 


1 S073| 


1 




1 7479 


1 0006 1 


I 


11384 


1 7553| 






§ 


12701 


12701 




12204 


12204 


I 


1 




1 


!3337| 








1 


1 


1 


1 7238 1 






























































Probe 
SEQ ID 
NO: 


| 5014 


| 5271 


I 






| 7628 


| 9314 


1 




| 7703; 








1671| 


| 1671! 


1 


1 2323 | 


| 23231 


| 2421 | 


1 2790 j 


I 2870 1 


| 3745' 


| 3980 






g 






1 
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Top Hit Descriptor 


IHomo sapiens SCG10 like-protein, helicase-like protein NHL, MSB, and ADP-ribosylation factor related I 
■protein 1 (ARFRP1) genes, complete cds j 


! Drosophila orena hunchback region | 


Homo sapiens cAMP responsive element binding protein-like 2 (CREBL2) mRNA


jMus musculus Ranbp7 gene, Staf gene and Wee1 gene | 


600943191F1 NIH_MGC_15 Homo sapiens cDNA clone IMAGE:2959510 5'


ar98c08.x1 Barstead colon HPLRB7 Homo sapiens cDNA clone IMAGE:2173646 3' similar to gb:Z26876
60S RIBOSOMAL PROTEIN L38 (HUMAN);


Mus musculus colony stimulating factor 1 receptor (Csf1r), mRNA


Mus musculus colony stimulating factor 1 receptor (Csf1r), mRNA


1 

1 
< 
Q 

1 
I 

Z 


'Arabidopsis thaliana RXW24L mRNA, partial cds | 


Saccharomyces cerevisiae suppressor of MIF2 Smt4p (SMT4) gene, complete cds | 


ou63b05.s1 NCI_CGAP_Br2 Homo sapiens cDNA clone IMAGE:1632465 3' similar to WP:C37A2.2
CE08611 ;


ou63b05.s1 NCI_CGAP_Br2 Homo sapiens cDNA clone IMAGE:1632465 3' similar to WP:C37A2.2
CE08611 ;


oo59d02.y5 NCI_CGAP_Lu5 Homo sapiens cDNA clone IMAGE:1570467 5' similar to contains L1 .t3 L1 | 


;oo53d02.y5 NCI_CGAP_Lu5 Homo sapiens cDNA clone IMAGE:1570467 5' similar to contains L1.t3 L1 I 
; repetitive element; | 


PM3-FN0058-140700-005-f09 FN0058 Homo sapiens cDNA | 


I 

s 

I 
1 

< 
Q 

1 
t 
| 

j 


fg48g12.x1 Soares_NFL_T_GBC_S1 Homo sapiens cDNA clone IMAGE:2112070 3' similar to contains
MER10.t3 MER10 repetitive element ;


Homo sapiens FYVE domain-containing dual specificity protein phosphatase FYVE-DSP1b mRNA, complete I 
cds j 


iHcmo sapiens FYVE domain-containing dual specificity protein phosphatase FYVE-DSP1b mRNA, complete | 


ncS8b06.r1 NCI_CGAP_Pr1 Homo sapiens cDNA clone IMAGE:771731 [ 


IHomo sapiens WRN (WRN) gene, complete cds | 


Hono sapiens partial AF-4 gene, exons 2 to 7 and Aiu repeat elements | 


zj53d1 1 M Soares ovary tumor NbHOT Homo sapiens oDNA clone IMAGE:741717 5' similar to [ 
!TR:G1 173905 G1173905 SPLICEOSOME ASSOCIATED PROTEIN. ; | 


Top Hit 
Database 
Source 


z 


z 






|EST_HUMA N I 


|eST HUMAN | 


z 


z 


|EST human j 


z 


z 


EST HUMAN 


EST HUMAN | 


EST HUMAN I 


I, 


lEST HUMAN | 


? 

5, 


EST HUMAN | 


z 


z 


|EST HUMAN | 




z 


|eST HUMAN | 


Top Hit Accession
No. 


AF21 7796.1 


|AJ005375.1 | 


| 4503034| 


IAJ278435.1 




AI582029.1 ! 


| 


I 
1 


IBF348454.1 J 


1 


jU27832.1 I 


1 


i 

< 


AI793275.1 


< 




;BE250048.1 | 


1 
3 


AF233437.1 j 


AF233437.1 


1 
1 


1 


1 
1 


s 


Most Similar 
(Top) Hit 
BLAST E 
Value 




| 8.0E-02| 


| 8.0E-02| 


s 

2 


7.9E-02


7.9E-02


7.9E-02


7.9E-02


7.9E-02


7.9E-02


7.9E-02


7.9E-02


7.9E-02


7.8E-02


7.8E-02


? 


j 7.8E-02| 


s 


7.8E-O2I 


7.8E-O2I 


s 


j 7.7E-02| 




7.7E-02I 










1 




1 


« 




P 




Sn 


1 


1 














8 










ORFSEQ 
ID NO: 


| 28310* 


| 








22673| 


1 


234721 








27S3si 


27839| 


1 


1 






1 


1 


27307| 




! 21139| 




26682I 


SEQ ID 

NO: 




| 19023 1 


| 13887| 


1 


8 


12875| 


1 


1 


1 




16562| 




17612| 






1 


13603 1 










1 


1 


1 


SEQ ID
NO: 


| 8172 


! 


1 


§ 

8? 




1 


| 3776| 


1 37761 


! 


1 


1 


1 


| 7762| 


| 1192| 


I 1192I 


! 


| 501 9| 




1 


| 723e| 




| 1378( 
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Top Hit Descriptor 


PROBABLE SERINE/THREONINE-PROTEIN KINASE YBR059C


Homo sapiens KIAA0628 gene product (KIAA0628), mRNA


Homo sapiens interferon regulatory factor 7 (IRF7), mRNA


601316426F1 NIH_MGC_8 Homo sapiens cDNA clone IMAGE:3634903 5'


EST112214 Cerebellum II Homo sapiens cDNA 5' end similar to similar to protocadherin 43


Homo sapiens ASCL3 gene, CEGP1 gene, C11orf14 gene, C11orf15 gene, C11orf16 gene and C11orf17


RC3-CT0347-110300-014-a05 CT0347 Homo sapiens cDNA


Homo sapiens SCL gene locus


Campylobacter jejuni NCTC11168 complete genome; segment 5/6


1 

1 

1 
I 

z 

i 
I 

0 


Homo sapiens solute carrier family 6 (neurotransmitter transporter, glycine), member 9 (SLC6A9), mRNA


Homo sapiens solute carrier family 6 (neurotransmitter transporter, glycine), member 9 (SLC6A9), mRNA


Homo sapiens chromosome 21 segment HS21C078


Homo sapiens IL-18 gene for interleukin-18, intron 1 and exon 2


wl52b02.x1 NCI_CGAP_Brn25 Homo sapiens cDNA clone IMAGE:2428491 3' similar to gb:M14328 ALPHA
ENOLASE (HUMAN);


AU116913 HEMBA1 Homo sapiens cDNA clone HEMBA1000264 5'


RC5-LT0054-260100-011-H09 LT0054 Homo sapiens cDNA


Equine herpesvirus 4 strain NS80567, complete genome


Mus musculus paired-like homeodomain transcription factor 1 (Pitx1), mRNA


wf43h01.x1 Soares_NFL_T_GBC_S1 Homo sapiens cDNA clone IMAGE:2358385 3'


Homo sapiens ADP/ATP carrier protein (ANT-2) gene, complete cds


Rattus norvegicus Activin receptor like kinase 1 (Acvrl1), mRNA


Mus musculus ubiquitin c-terminal hydrolase related polypeptide (Uchrp), mRNA


yg14g06.r1 Soares infant brain 1NIB Homo sapiens cDNA clone IMAGE:32339 5'


to 

O 
2 

1 
< 
a 

1 
1 
| 
8, 

CD 
X 

1 


Human periodic tryptophan protein 2 (PWP2) gene, exons 15 to 21, and complete cds


Homo sapiens histone deacetylase 5 (NY-CO-9), mRNA


CM4-HT0243-081199-037-d11 HT0243 Homo sapiens cDNA


601453813F1 NIH_MGC_66 Homo sapiens cDNA clone IMAGE:3857738 5'


601658738R1 NIH_MGC_69 Homo sapiens cDNA clone IMAGE:3886209 3'


Top Hit 
Database 


SWISSPROT


NT


NT


EST_HUMAN


EST_HUMAN


NT


z 

i 


;nt 1 


Z 


(EST HUMAN ! 


I INj 


|nt 


z 


INT I 


ESTJHUMAN | 


!est_human ! 


I EST HUMAN ! 


z 


[NT 


lEST HUMAN I 


INT | 


z 


1 1N| 


I 


!EST HUMAN | 


NT 


'NT | 


'ESTJHUMAN | 


'ESTJHUMAN | 


ESTJHUMAN | 


Top Hit Accession
No. 


1 


| 114227571 


1 


I 


I 


IAJ400877.1 


iAW858844.1 


1AJ131016.1 ! 


IAL139078.2 ] 


I 
1 
| 


| 5902093 


1 


; AL1 63278.2 i 




< 


AU116913.1 | 


1 


E 

< 




1 
< 


;L78810.1 i 


1 

3 


a 
1 


£ 




i 
1 


11 525893 | 


I 
1 


BF035099.1 | 


1 

a 


Most Similar 
(Top) Hit 
BLAST E 
Value 


7.7E-02


7.7E-02


9 


s 


| 7.6E-02| 


I 7.6E-O2I 


| 7.6E-02| 


I 


| 7.6E-02| 


| 7.6E-02| 


| 7.5E-02| 


7.5E-02I 


s 


| 7.5E-02| 


| 7.5E-02I 


| 7.5E-02| 


| 7.4E-02| 


s 


| 7.4E-02| 


1 


| 7.4E-02| 


3 




! 7.4E-02| 


j 7.4E-02| 


j 7.4E-02] 


! 7.4E-02| 


! 7.4E-02| 


! 7.4E-02| 


! 7.3E-02| 


f _ 
1 1 




I 5.24, 




2.57; 


I 0.94, 


1 


i 














1 










1 


1 








I 1-751 






| 2.08| 






1 0.96| 


ORFSEQ 
ID NO: 


| 27736 


| 28504 






23081, 


1 




27518| 




29028 1 


20535 1 


20536 | 


1 


24085| 


1 




2023 1| 






23255| 




















20222I 


c e .. 


17510' 




1 




13281 


13428 


§ 




I 


18735 








14301 




16831 






12410 


1 


§ 




1 


15819 


1 


i 


1 






1040s) 
































































Probe 
SEQ ID
NO: 






| 9535 


| 3341 




| 3512-1 


1 


1 


§ 










| 4407| 


1 




S 


1 


1 


1 35451 


1 


1 4687| 




| 5913| 


| 66121 


1 


| 9271 j 


1 95271 




9 
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Top Hit Descriptor 


601658738R1 NIH_MGC_69 Homo sapiens cDNA clone IMAGE:3886209 3'


Thermotoga maritima section 101 of 136 of the complete genome


CM0-NN1004-130300-284-g08 NN1004 Homo sapiens cDNA


Homo sapiens chromosome 21 segment HS21C102


Mus musculus transcription factor USF2 (USF2) gene, exons 8-10 and complete cds


PROLINE-RICH PROTEIN MP-3


PROLINE-RICH PROTEIN MP-3


Homo sapiens KIAA0424 protein (KIAA0424), mRNA


zj24a02.s1 Soares_fetal_liver_spleen_1NFLS_S1 Homo sapiens cDNA clone IMAGE:451178 3' similar to
gb:L02426 26S PROTEASE SUBUNIT 4 (HUMAN);


Methanobacterium thermoautotrophicum from bases 1029155 to 1039934 (section 88 of 148) of the complete


Methanobacterium thermoautotrophicum from bases 1029155 to 1039934 (section 88 of 148) of the complete
genome


Homo sapiens chromosome 21 segment HS21C101


Homo sapiens chromosome 21 segment HS21C101


Human immunodeficiency virus type 1 isolate 26 reverse transcriptase (pol) gene, internal fragment, partial


I 

S3 

I 

§ 
I 

1 
1 

1 
1 

1 
z 
5 

i 

1 

1 


602077757F1 NIH_MGC_62 Homo sapiens cDNA clone IMAGE:4251950 5'


Rhodomonas salina mitochondrion, complete genome


Streptococcus mutans gene for glucose-1-phosphate uridylyltransferase, complete cds


Methanococcus jannaschii section 73 of 150 of the complete genome


CALMODULIN


601883558F1 NIH_MGC_57 Homo sapiens cDNA clone IMAGE:4095710 5'


Strongylocentrotus purpuratus mitochondrion, complete genome


AV712452 DCA Homo sapiens cDNA clone DCAAUG01 5'


Homo sapiens plasma membrane calcium ATPase isoform 1 (ATP2B1) gene, alternative splice products,
partial cds


hq24f11.x1 NCI CGAP Adii Homo sapiens cDNA clone IMAGE:3120333 3' similar to TR:Q9Z340 Q9Z340
ATYPICAL PKC SPECIFIC BINDING PROTEIN. ;


j 
; 

: 

\ 
| 

i 


genes, complete cds; and plasma membrane calcium ATPase isoform 3 (PMCA3) gene, partial cds


Top Hit 
Database 


EST HUMAN 


z 


EST HUMAN , 


NT 


NT ! 


SWISSPROT 


SWISSPROT 


NT 


1 

=> 

i 


NT 


NT ! 


NT ] 


NT I 




z 

5 

X 


EST HUMAN \ 


NT i 




INJ J 


SWISSPROT | 


1 

3 

X 


z 


EST HUMAN | 


NT ! 


EST HUMAN | 


NT 


Top Hit Accession
No. 






AW900281.1 


AL1 63302.2 


U12283.1 




P05143 


7662107 


1 


AE000882.1 


AE000882.1 


AL163301.2 


AL163301.2 


U14794.1 


AW298322.1 | 


j 


§ 
1 


1 


U67531.1 








AV712452.1 | 


L14561.1 | 


AW873187.1 | 


U82695.2 


Most Similar 
(Top) Hit 
BLAST E 
Value 


7.3E-02 


7.3E-02 


7.3E-02i 


7.3E-02, 


7.3E-02 


S 


7.3E-02| 


7.3E-02| 


. 7.3E-02I 


7.2E-02! 




7.2E-02| 


7.2E-02| 




7.2E-02| 


7.2E-02| 


i 


s 




7.2E-02| 


7.2E-02| 


7.2E-02| 


7.2E-02| 


7.2E-02| 


7.2E-02I 


7.2E-02 


ll 


0.96j 




















1 


2.23 1 




oi 






78.82| 


1 


3^31 




9.33 1 


175] 


2.05J 


4.23) 






ORFSEQ 
ID NO: 


20223' 


1 


21237| 






1 


26438 | 




1 




1 


21 228 | 






a 


S3 


24268 | 


1 


24922] 




26272 | 






1 


2781 8 | 


28016 


SEQ ID 
NO: 


10405 








1 


s 


16275 | 


8 




1 


1 


11365 1 


1 


a 






! 




15154) 




16118| 




I 


i74re| 


I 


17776 


Probe 
SEQ ID 
NO: 


i 


1 


! 




1 


S 


6413 










1460 1 


! 


2502I 


3810| 


! 




1 


5230 1 


5231 1 


s 


s 


K 




774o| 


7926 
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.14837 | 






























I 

1 














R 
















2 
'E 




I products 
















to 


Q 








I 














VGE:27160 


1 














27184 3' sit 






| 


Q 




Top Hit Descriptor 


601343926F1 NIH_MGC_53 Homo sapiens cDNA clone IMAGE:3685951 5'


601065194F1 NIH_MGC_10 Homo sapiens cDNA clone IMAGE:3451559 5'


Rattus norvegicus bHLH transcription factor Mist1 (Mist1) gene, complete cds


af81a04.r1 Soares_NhHMPu_S1 Homo sapiens cDNA clone IMAGE:104839


AJ230796 Homo sapiens library (Seranski P) Homo sapiens cDNA clone PS1


no05h08.s1 NCI_CGAP_Phe1 Homo sapiens cDNA clone IMAGE:1099839


Homo sapiens ataxia telangiectasia (ATM) gene, complete cds


CM4-NN1009-200300-116-C11 NN1009 Homo sapiens cDNA


Human immunodeficiency virus type 1 (D9) proviral structural capsid protein (


Pseudomonas aeruginosa PA01, section 451 of 529 of the complete genome


601872281F1 NIH_MGC_53 Homo sapiens cDNA clone IMAGE:4092981 5'


601143974F1 NIH_MGC_15 Homo sapiens cDNA clone IMAGE:3051234 5'


COLLAGEN ALPHA 1(XVI) CHAIN PRECURSOR


I 

| 


zl66f04.s1 Stratagene colon (#937204) Homo sapiens cDNA clone IMAGE:5(


UI-H-BI1-acy-c-07-0-UI.s1 NCI_CGAP_Sub3 Homo sapiens cDNA clone IM


ai65a12.s1 Soares_testis_NHT Homo sapiens cDNA clone 1375678 3' simila
RIBOSOMAL PROTEIN L32 (HUMAN);


QV4-BT0407-280100-090-e10 BT0407 Homo sapiens cDNA


CM0-UM0001-060300-270-e12 UM0001 Homo sapiens cDNA


Canis familiaris inducible nitric oxide synthase mRNA, complete cds


601816291F1 NIH_MGC_56 Homo sapiens cDNA clone IMAGE:4050071 5'


African swine fever virus, complete genome


Rat Ig germline epsilon H-chain gene C-region, 3' end


ah99a05.s1 Soares_NFL_T_GBC_S1 Homo sapiens cDNA clone IMAGE:13
TIGHT JUNCTION PROTEIN ZO-1 (HUMAN);


Homo sapiens chromosome 21 segment HS21C010


Homo sapiens regulator of Gz-selective protein signaling (ZGAP1) mRNA, an


26S PROTEASOME REGULATORY SUBUNIT S3 (NUCLEAR ANTIGEN 2


26S PROTEASOME REGULATORY SUBUNIT S3 (NUCLEAR ANTIGEN 2


Rabies virus isolate b615 glycoprotein gene, partial cds
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3 
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3 
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z 
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Z 






SIMS| 
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£2 


w 
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z 


eg 


z ; 




SIMS 
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§ 

! 




















































i 1 










|BE565003.- 


|BE539214/ 


< 




[AJ230796.1 


[AA5844S5: 


3 


JAW9D0962 


|lO2290.1 


l 

< 


I 


JBE304764' 


§ 


jX96677.1 


'AA056343.- 


JAW138152 


AA815438- 


IBE070284.- 


JAW7929B2 


I 

< 


1 


1 


^02901.1 


1 


AL1 6321 0.2 


c 


1 


§ 


£ 


Most Similar 
(Top) Hit 
BLAST E 
Value 


I 


i 


I 7.2E-02| 


| 7.2E-02| 


1 


| 7.2E-02| 


7.2E-02| 




| 


! 7.1E-02I 


! 7.1E-02| 


s 


! 7.0E-02I 


j 7.0E-02| 


S 


| 7.0E-02| 




7.0E-02| 


! 7.0E-02| 


j 7.0E-02| 


! 7.0E-02I 


s 


! 7.0E-02| 


i 7.0E-02I 


6.9E-02I 


6.9E-02 


I 6.9E-02| 


| 6.9E-02| 


! 6.9E-02| 
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ORFSEQ 
ID NO: 






! 28407| 












1 




22028 | 




20270 | 




21 495 [ 


2271 5 | 




23642| 






24510| 


27329 1 


• 27597I 




j 20256I 




| 23421 | 


j 23422 [ 




lag 
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SEQ ID 
NO: 
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Top Hit Descriptor 


Human hereditary haemochromatosis region, histone 2A-like protein gene, hereditary haemochromatosis
(HLA-H) gene, RoRet gene, and sodium phosphate transporter (NPT3) gene, complete cds


Homo sapiens mucin 5B (MUC5B) gene, partial cds


Drosophila melanogaster mRNA for mod(mdg4)51.4 protein


Mus musculus major histocompatibility locus class III regions Hsc70t gene, partial cds; smRNP, G7A, NG23,
MutS homolog, CLCP, NG24, NG25, and NG26 genes, complete cds; and unknown genes


HEAT SHOCK PROTEIN 70 HOMOLOG


Hepatitis G virus RNA for polyprotein (NS5A region), partial cds, strain: CMR-152


| 

1 

| 
< 

D 

8 
| 

s, 

o 

X 

z 


TRANSCRIPTIONAL REGULATORY PROTEIN ALGP (ALGINATE REGULATORY PROTEIN ALGR3)


Rattus norvegicus differentiation-associated Na-dependent inorganic phosphate cotransporter (DNPI) mRNA,


complete cds


52 KD RO PROTEIN (SJOGREN SYNDROME TYPE A ANTIGEN (SS-A)) (RO(SS-A)) (RO52)


AV705701 ADB Homo sapiens cDNA clone ADBBAB03 5'


Mus musculus stromal cell derived factor receptor 2 (Sdfr2), mRNA


Metarhizium anisopliae mRNA for Chymotrypsin (chy1 gene)


Aquifex aeolicus section 82 of 109 of the complete genome


7l37h08.x1 Soares_NSF_F8_9W_OT_PA_P_S1 Homo sapiens cDNA clone IMAGE:3523815 3' similar to
TR:Q9Y4S6 Q9Y4S6 HYPOTHETICAL 30.3 KD PROTEIN. [1] ;


Human mRNA, Xq terminal portion


Arabidopsis thaliana K+ inward rectifying channel protein (AtKC1) gene, complete cds


Homo sapiens mRNA for KIAA1464 protein, partial cds


H. sapiens mRNA for B-HLH DNA binding protein


IL3-HT0618-110500-136-C05 HT0618 Homo sapiens cDNA


S.japonicum mRNA for serine-enzyme


Homo sapiens chromosome 21 segment HS21C007


Thermotoga maritima section 89 of 136 of the complete genome


EST380924 MAGE resequences, MAGJ Homo sapiens cDNA


Mesocestoides corti mitochondrial DNA, NADH dehydrogenase subunit 4, tRNA-Gln, tRNA-Phe, tRNA-Met,
ATPase subunit 6, and NADH dehydrogenase subunit 2


zp78c04.r1 Stratagene HeLa cell s3 937216 Homo sapiens cDNA clone IMAGE:626310 5'


Top Hit 1 
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i SOUrCe 


Z 


2 


2 


z 


SWISSPROT | 


2 




SWISSPROT I 




z 


ISWISSPROT I 


|EST HUMAN | 
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EST HUMAN I 


z 


z 
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lEST HUMAN ] 


z 


2 


2 


jEST HUMAN j 


2 


5 


1 
| 
- 


d 
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U91328.1 


AF107890.1 | 




AF1 09905.1 


1 




BF210736.1 I 


1 ; 


i 


AF271 235.1 I 


a 


AV705701.1 | 


1 


AJ242735.1 j 


1 


,BF1 12039.1 




U73325.1 






IBE1795431 | 


1 


8 
< 


| 
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1 
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I 
< 


1 
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(Top) Hit 
BLAST E 
Value 
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s I 
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ORFSEQ 
ID NO: 


29050 




25295 | 


21490 




27825 1 












275251 


28827 | 




25265I 


20038 1 




24785 1 


1 


28248' 






8 
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SEQ ID 

NO: 


1 8754 
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19017 
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1 7602 


I5837I 






14373 




17319 


1 8544 
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SEQ ID 
NO: 
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1 


I 9337] 
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I 7752I 




1 93731 
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I 7459I 
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Top Hit Descriptor 


Mus musculus tuftelin 1 (Tuft1), mRNA


Homo sapiens eIF4E-transporter (4E-T), mRNA


Homo sapiens eIF4E-transporter (4E-T), mRNA


Mus musculus second IL11 receptor alpha chain (IL11Ra2) gene, exons 1 and 2


Citrobacter freundii DSM 30040 cyclopropane fatty acid synthase (cfa) gene, partial cds, dihydroxyacetone
kinase (dhaK), glycerol dehydrogenase (dhaD), transcriptional activator (dhaR), 1,3-propanediol
dehydrogenase (dhaT), glycerol dehydratase (dhaB),>


Oryza sativa rbbi3-1 gene for putative Bowman Birk trypsin inhibitor


RC5-BT0559-140200-012-C03 BT0559 Homo sapiens cDNA


Xenopus laevis homeobox protein (Vox-1) mRNA, complete cds


Mus musculus p-glycoprotein (mdr1a) gene, exons 1 and 2


f 

1 

E 

f 
O 

a 

> 
j 
I 

1 
1 

i 

i 

2 
1 


Rana catesbiana heat shock protein 30 (HSP30) mRNA, complete cds


QV0-ST0213-021299-062-a09 ST0213 Homo sapiens cDNA


1 
1 
1 
I 

1 

i 

1 
1 

a 


ye37f12.r1 Stratagene lung (#937210) Homo sapiens cDNA clone IMAGE:119951 5' similar to gb:K01506
HLA CLASS II HISTOCOMPATIBILITY ANTIGEN, DP(1) ALPHA CHAIN (HUMAN);


Pseudomonas putida ttgS gene


Drosophila melanogaster laminin B2 gene, complete cds


Drosophila melanogaster laminin B2 gene, complete cds


I 
s 
1 

8 
s 

1 


Arabidopsis thaliana eli5 gene, exons 1-11


Mus musculus caudal type homeobox-1 (Cdx-1) gene, complete cds


Helicobacter pylori 26695 section 5 of 134 of the complete genome


Helicobacter pylori 26695 section 5 of 134 of the complete genome


Lymphocystis disease virus 1, complete genome


nuclear protein TIF1 isoform [mice, mRNA, 4053 nt]


Podospora anserina mitochondrial epsilon-sen DNA


Homo sapiens meprin A, alpha (PABA peptide hydrolase) (MEP1A) mRNA


Homo sapiens partial LMO1 gene for LIM domain only 1 protein, exon 1


Homo sapiens partial LMO1 gene for LIM domain only 1 protein, exon 1


Human steroid hormone receptor Ner-I mRNA, complete cds


Drosophila melanogaster filament protein homolog (sep1) gene, complete cds


Top Hit 
Database 


Z 


z 


z 








EST HUMAN J 


Z 


z 


z 




i, 


,EST HUMAN | 


IEST HUMAN | 


z 


z 


Z 




z 


z 


z 


z 


z 


z 


z 


z 




z 


z 




Top Hit Accession
No. 






109470341 


I 


U09771.1 


IAJ277468.1 | 


i 

s 


jU53528.1 | 


1 
I 


Iu20790.1 I 


i 


AW391248.1 I 


IAW391248.1 | 


i 

i 


|AJ276408.1 | 


IM58417.1 | 


1 


1 


1 
1 


i 

s 

2 


1 
< 




S 

1 


|S78221.1 1 


1X03127.1 I 


1 
I 


|AJ277661.1 j 


|AJ277661.1 | 


|U07132.1 ! 


|L33246.1 


Most Similar 
(Top) Hit 
BLAST E 
Value 


5.5E-02| 


9 

»■ 


5.5E-02J 


5.5E-02| 


5.5E-02 
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1 5.4E-02] 
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! 5.3E-02! 


| 5.3E-02 
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I 5.3E-02, 


S 
«. 


I 5.3E-02i 
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1 5.2E-02 
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S 
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ORF SEQ 
ID NO: 


s 


276321 


27633 




23508 






24563 








20797I 


20798 






22638 




22834 


24185 


24662I 


1 


24956 






I 27349 




22793 


I 22794 


1 23854 




SEQ ID 
NO: 


iS228 








s 


a 
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1 4787 




S7974 




0955 
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! 3003 1 


I3003j 


1 
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s 
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I 3375I 
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1 






WO 01/57274 PCT/US01/00666 







Top Hit Descriptor 


601177907F1 NIH_MGC_17 Homo sapiens cDNA clone IMAGE:3533353 5'


QV1-NN0012-180400-164-f06 NN0012 Homo sapiens cDNA


Homo sapiens KIAA0867 protein (KIAA0867), mRNA


Fugu rubripes neural cell adhesion molecule L1 homolog (L1-CAM) gene, complete cds; putative protein 1
(PUT1) gene, partial cds; mitosis-specific chromosome segregation protein SMC1 homolog (SMC1) gene,
complete cds; and calcium channel alpha-1 subunit>


Brassica napus gln gene for plastid glutamine synthetase, exons 1-12


Homo sapiens mRNA for KIAA1471 protein, partial cds


Homo sapiens cytochrome P450 polypeptide 43 (CYP3A43) gene, partial cds; cytochrome P450 polypeptide
4 (CYP3A4) and cytochrome P450 polypeptide 7 (CYP3A7) genes, complete cds; and cytochrome P450
polypeptide 5 (CYP3A5) gene, partial cds


Strongylocentrotus purpuratus homolog of human bone morphogenetic protein 1 (submp) mRNA, complete
cds


GLUCOAMYLASE S1/S2 PRECURSOR (GLUCAN 1,4-ALPHA-GLUCOSIDASE) (1,4-ALPHA-D-GLUCAN
GLUCOHYDROLASE)


Methanobacterium thermoautotrophicum strain Marburg, Thiol:fumarate reductase subunit A


Kluyveromyces lactis gene for Ca++ ATPase


Ovis aries mRNA for acetyl-coA carboxylase


UI-H-BW1-anx-h-08-0-UI.s1 NCI_CGAP_Sub7 Homo sapiens cDNA clone IMAGE:3084134 3'


FAS ANTIGEN LIGAND


M.musculus DNA for desmin-binding fragment DesD7


Homo sapiens succinate dehydrogenase complex, subunit C, integral membrane protein, 15kD (SDHC)
mRNA


Homo sapiens hypothetical protein PRO1163 (PRO1163), mRNA


Homo sapiens hypothetical protein PRO1163 (PRO1163), mRNA


601906848F1 NIH_MGC_54 Homo sapiens cDNA clone IMAGE:4134779 5'


Felis catus G-CSF gene for granulocyte colony-stimulating factor, complete cds


Human germline T-cell receptor beta chain TCRBV17S1A1T, TCRBV2S1, TCRBV10S1P, TCRBV29S1P,
TCRBV19S1P, TCRBV15S1, TCRBV11S1A1T, HVB relic, TCRBV28S1P, TCRBV34S1, TCRBV14S1,
TCRBV3S1, TCRBV4S1A1T, TRY4, TRY5, TRY6, TRY7, TRY8, TCRBD1, TCRBJ1S1, TCRBJ1S2,>


Mus musculus chromosome X contigB; X-linked lymphocyte regulated 5 gene, Zinc finger protein 275, Zinc
finger protein 92, mmxq28orf


601510891F1 NIH_MGC_71 Homo sapiens cDNA clone IMAGE:3912215 5'
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|AB0409 


I 


L23838. 
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1 




8 




25135| 


I 


21084| 


1 






24749| 










21683| 


SEQ ID 

NO: 


| 13736| 


1 
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1 
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! 




1 


11020| 


11229| 


11813| 




E 


14973 | 






19248 




11804| 


Probe 
SEQ ID 

NO: 


| 3824 
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1 
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1 9873] 


1 


5304 




| 7059 | 


1 7572| 
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| 1104| 


| 1322| 
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| 2671 | 


| 5105| 


| 5105| 


| 6581 | 
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Top Hit Descriptor 


Homo sapiens ATP-binding cassette, sub-family A (ABC1), member 8 (ABCA8), mRNA


Human von Willebrand factor gene, exons 23 through 34


Homo sapiens PELOTA (PELOTA) gene, complete cds


LAMININ ALPHA-1 CHAIN PRECURSOR (LAMININ A CHAIN)


Homo sapiens plasma membrane calcium ATPase isoform 1 (ATP2B1) gene, alternative splice products,
partial cds


wr85e08.x1 NCI_CGAP_Kid11 Homo sapiens cDNA clone IMAGE:2494502 3'


Homo sapiens mRNA for KIAA0718 protein, partial cds


EOMESODERMIN


601896233F1 NIH_MGC_19 Homo sapiens cDNA clone IMAGE:4125584 5'


Mus musculus potassium large conductance pH-sensitive channel, subfamily M, alpha member 3 (Kcnma3),
mRNA


601762117F1 NIH_MGC_20 Homo sapiens cDNA clone IMAGE:4024973 5'


Homo sapiens solute carrier family 22 (organic cation transporter), member 1 (SLC22A1), mRNA


H.vulgare Ss1 gene for sucrose synthase


Homo sapiens genomic region containing hypervariable minisatellites chromosome 10[10q26.3] of Homo


Homo sapiens genomic region containing hypervariable minisatellites chromosome 10[10q26.3] of Homo


CM2-EN0013-110500-192-b10 EN0013 Homo sapiens cDNA


CM2-EN0013-110500-192-b10 EN0013 Homo sapiens cDNA


Chromatium vinosum sulfur globule protein Cv2 precursor (sgp2) gene, complete cds


nw20e05.s1 NCI_CGAP_GCB0 Homo sapiens cDNA clone IMAGE:1241024 3' similar to gb:J00314_rna2
TUBULIN BETA-1 CHAIN (HUMAN);


Dictyostelium discoideum unknown spore germination-specific protein-like protein, orf1, orf2 and orf3 genes,
complete cds


Dictyostelium discoideum unknown spore germination-specific protein-like protein, orf1, orf2 and orf3 genes,
complete cds


Drosophila melanogaster tiggrin mRNA, complete cds


Homo sapiens microsomal epoxide hydrolase (EPHX1) gene, complete cds


602085136F1 NIH_MGC_83 Homo sapiens cDNA clone IMAGE:4249377 5'


602085136F1 NIH_MGC_83 Homo sapiens cDNA clone IMAGE:4249377 5'
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Top Hit Descriptor 


CYSTATHIONINE BETA-LYASE PRECURSOR (CBL) (BETA-CYSTATHIONASE) (CYSTEINE LYASE)


EST370539 MAGE resequences, MAGE Homo sapiens cDNA


HYPOTHETICAL 80.7 KD PROTEIN IN SOD1-CPA2 INTERGENIC REGION


Maize actin1 gene (MAc1), complete cds


1 
I 

a 
1 
1 

S 

o 1 
a 

I 


L.lactis MG1363 grpE and dnaK genes


PM1-CT0326-291299-002-h03 CT0326 Homo sapiens cDNA


601178765F1 NIH_MGC_20 Homo sapiens cDNA clone IMAGE:3543833 5'


Homo sapiens mRNA for FLJ00013 protein, partial cds


Homo sapiens mRNA for FLJ00013 protein, partial cds


Homo sapiens mRNA for FLJ00013 protein, partial cds


Homo sapiens mRNA for FLJ00013 protein, partial cds


xv26d07.x1 Soares_NFL_T_GBC_S1 Homo sapiens cDNA clone IMAGE:2814253 3' similar to
SW:C211_HUMAN P53801 PUTATIVE SURFACE GLYCOPROTEIN C21ORF1 PRECURSOR ;


Homo sapiens hypothetical protein FLJ13220 (FLJ13220), mRNA


yc20e06.r1 Stratagene lung (#937210) Homo sapiens cDNA clone IMAGE:81250 5' similar to contains
MER29 repetitive element


Homo sapiens chromosome 21 segment HS21C008


RC3-FN0155-060700-011-d10 FN0155 Homo sapiens cDNA


RC6-UM0015-210200-021-A10 UM0015 Homo sapiens cDNA


M.musculus S-antigen gene promoter region


LA PROTEIN HOMOLOG (LA RIBONUCLEOPROTEIN) (LA AUTOANTIGEN HOMOLOG)


Caenorhabditis elegans mRNA for DYS-1 protein, partial


Human lysyl oxidase-like protein gene, exon 3


wl99d04.x1 NCI_CGAP_Brn25 Homo sapiens cDNA clone IMAGE:2433031 3'


zq04f11.s1 Stratagene muscle 937209 Homo sapiens cDNA clone IMAGE:628749 3' similar to
TR:G1017425 G1017425

IPISGKPLPKVTLSRDGVPLKATMRFNTEITAENLTINLKESVTADAGRYEITAANSSGTTKAFINIWLDRPG
PPT GPWISDITEESVTLKWEPPKYDGGSQVTNYILLKRETSTAVWTEVSATVARTMMKVMKL ... ;


I 
O 

i 
1 

z 

.§ 
1 

I 

1 

v; 


Cricetulus griseus CYP2A17 mRNA for cytochrome P450 2A17, complete cds


Homo sapiens skeletal muscle LIM-protein 1 (FHL1) gene, complete cds
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Top Hit Descriptor 


Aquifex aeolicus section 32 of 109 of the complete genome


yf25o09.r1 Soares fetal liver spleen 1NFLS Homo sapiens cDNA clone IMAGE:127888 5'


Homo sapiens skeletal muscle LIM-protein 1 (FHL1) gene, complete cds


Mus musculus tumor rejection antigen gp96 (Tra1), mRNA


601853910F1 NIH_MGC_57 Homo sapiens cDNA clone IMAGE:4073787 5'


601853910F1 NIH_MGC_57 Homo sapiens cDNA clone IMAGE:4073787 5'


602247171F1 NIH_MGC_62 Homo sapiens cDNA clone IMAGE:4332497 5'


ye49f11.r1 Soares fetal liver spleen 1NFLS Homo sapiens cDNA clone IMAGE:121101 5'


Human interleukin 11 (IL11) gene, complete mRNA


Oryctolagus cuniculus gene encoding ileal sodium-dependent bile acid transporter


Drosophila melanogaster heat shock protein 68 (hsp68) gene, hsp68d allele, complete cds


Drosophila melanogaster heat shock protein 68 (hsp68) gene, hsp68d allele, complete cds


Homo sapiens telomerase reverse transcriptase (TERT) gene, exons 7-16 and complete cds


LARGE TEGUMENT PROTEIN


Oryctolagus cuniculus gene encoding ileal sodium-dependent bile acid transporter


601442431F1 NIH_MGC_65 Homo sapiens cDNA clone IMAGE:3846727 5'
xa52b04.x1 NCI_CGAP_Sar4 Homo sapiens cDNA clone IMAGE:2570383 3' similar to SW:Y069_HUMAN
Q15041 HYPOTHETICAL PROTEIN KIAA0069 ;


qg27f11.x1 NCI_CGAP_Kid3 Homo sapiens cDNA clone IMAGE:1762317 3'


601493473T1 NIH_MGC_70 Homo sapiens cDNA clone IMAGE:3895578 3'


Rattus norvegicus Nerve growth factor receptor, fast (Ngfr), mRNA


Homo sapiens KIAA1070 protein (KIAA1070), mRNA


Homo sapiens chromosome 21 segment HS21C103


zs84c02.r1 NCI_CGAP_GCB1 Homo sapiens cDNA clone IMAGE:704162 5'


UI-HF-BN0-akj-e-10-0-UI.r1 NIH_MGC_50 Homo sapiens cDNA clone IMAGE:3077466 5'


602015501F1 NCI_CGAP_Brn64 Homo sapiens cDNA clone IMAGE:4150944 5'


on25f06.y5 NCI_CGAP_Lu5 Homo sapiens cDNA clone IMAGE:1557827 5'


on26f06.y5 NCI_CGAP_Lu5 Homo sapiens cDNA clone IMAGE:1557827 5'


601680305R2 NIH_MGC_83 Homo sapiens cDNA clone IMAGE:3950665 3'


601680305R2 NIH_MGC_83 Homo sapiens cDNA clone IMAGE:3950665 3'


Rattus norvegicus rabphilin-3A mRNA, complete cds


H.carterae mRNA for fucoxanthin chlorophyll a/c binding protein, Fcp1


H.carterae mRNA for fucoxanthin chlorophyll a/c binding protein, Fcp1


PM2-NN0128-080700-001-a12 NN0128 Homo sapiens cDNA


PM2-NN0128-080700-001-a12 NN0128 Homo sapiens cDNA


O 

3 

1 

1 
I 

z 


7e30e09.x1 NCI_CGAP_Lu24 Homo sapiens cDNA clone IMAGE:3284008 3' similar to contains L1.t1 L1
repetitive element ;


601579393F1 NIH_MGC_9 Homo sapiens cDNA clone IMAGE:3928054 5'


O 

3 

1 
< 

z 

■a 
I 
1 

1 

< 

8 

1 


1 
O 

3 

1 
0 

1 

.1 

1 

I 

8 
5 

1 


HYPOTHETICAL 46.7 KD PROTEIN C19G10.05 IN CHROMOSOME I


HYPOTHETICAL 46.7 KD PROTEIN C19G10.05 IN CHROMOSOME I


Bos taurus partial stat5B gene, exons 17-19


Mus musculus major histocompatibility locus class II region: major histocompatibility protein class II alpha
chain (IAalpha) and major histocompatibility protein class II beta chain (IEbeta) genes, complete cds;
butyrophilin-like (NG9), butyrophilin-li>


Homo sapiens gene for LECT2, complete cds


Top Hit 
Database 


NT


NT


EST_HUMAN


X 

a 


I EST HUMAN [ 


NT i 




NT | 


z 

I 

X 


EST HUMAN [ 


EST HUMAN | 


X 


EST HUMAN | 


Z 

5 


EST HUMAN | 


NT | 


z 


NT \ 


z 

3 


EST_HUMAN


EST_HUMAN


EST_HUMAN


EST_HUMAN


EST_HUMAN


EST_HUMAN


SWISSPROT


SWISSPROT


z 


NT 


NT I 


Top Hit Accession
No. 






241154.1 | 


1 


i 


6981271| 


11432020 1 


63303.2 j 






43827.1 | 


)3130.1 




| 






1 


S 


01165.1 | 


01165.1 | 


1 


70128.1 | 


l 


26722.1 i 








i 


50157.1 


07546.1 | 




1 


AEO 




s 

< 








< 




3 
< 




- 

< 


AI7E 


BE9 




3 




1 


BE7 








BE7 


BF5 




Q10 


Q10 


< 


AFO 


ABO 


Most Similar 
(Top) Hit 
BLAST E 
Value 


| 2.6E-02] 


i 2.6E-02| 


| 2.6E-02I 


9 

1 


| 2.6E-02| 


9 


i 


|. 2.6E-02| 


' 2.6E-02| 


9 

a 


2.6E-02| 


I 2.5E-02| 


9 

w 


1 2.5E-02| 


l 2.5E-02| 


2.5E-02


2.5E-02


2.5E-02


2.5E-02


2.5E-02


2.5E-02


2.5E-02


2.5E-02


2.5E-02


2.5E-02


2.5E-02


2.5E-02


2.5E-02


2.5E-02


2.5E-02


Signal 


I 




2.06 




s 


















1 12 -83] 




8 






I 1.09[ 


| 1.09| 




? 


i 3.86| 






I 2.45| 


I 2.45| 


! 3.38 [ 




I 1-74 


O 
























































!8370l 






O ~ 






























































SEQ ID
NO: 


1 


1 


1 




I 15776 


| 15977| 








1 


| 19715| 


1 


10463 


| 10721] 


| 10778' 




1 12853] 




| 15070| 


1 15070] 


1 


1 


1 




s 


| 18071 | 


| 18071 i 


| 18118| 






Probe 
SEQ ID
NO: 


1 4811 




| 5025 


| 5771 


| 5870 


| 6130| 


s 


s 


I 


1 






5 


1 


s 


1 


s 


1 


1 


S 
8 




I 


1 


| 6519] 


1 


1 


| 8185[ 


1 


8255 


i 






WO 01/57274 



PCT/US01/00666 






Top Hit Descriptor 


Mus musculus DinB homolog 1 (E. coli) (Dinb1), mRNA




Mus musculus DinB homolog 1 (E. coli) (Dinb1), mRNA


Homo sapiens genomic region containing hypervariable minisatellites chromosome 1 [1p36.33] of Homo sapiens


mRNA [ 


< 

1 


mRNA | 


rrRNA j 


i 


7g51c08.x1 NCI_CGAP_Pr28 Homo sapiens cDNA clone IMAGE:3309998 3' similar to contains MER1.t3 MER1 repetitive element ;


Mus musculus sema domain, transmembrane domain (TM), and cytoplasmic domain, (semaphorin) 6B (Sema6b), mRNA


Arabidopsis thaliana C2H2 zinc finger protein FZF mRNA, complete cds


P. vulgaris hydroxyproline-rich glycoprotein (HRGP) mRNA, 3' end


Homo sapiens chromosome 21 segment HS21C078


aa15b10.r1 Soares_NhHMPu_S1 Homo sapiens cDNA clone IMAGE:813307 5'


Japanese encephalitis virus envelope protein mRNA, partial cds


wa17b02.x1 NCI_CGAP_Kid11 Homo sapiens cDNA clone IMAGE:2298315 3'


Mycobacterium tuberculosis H37Rv complete genome; segment 93/162


Equus caballus DNA for 17alpha-hydroxylase/17,20-lyase, complete cds


Homo sapiens ankyrin 3, node of Ranvier (ankyrin G) (ANK3), transcript variant 1, mRNA


Homo sapiens ankyrin 3, node of Ranvier (ankyrin G) (ANK3), transcript variant 1, mRNA


aa15b10.r1 Soares_NhHMPu_S1 Homo sapiens cDNA clone IMAGE:813307 5'


Arabidopsis thaliana DNA chromosome 4, contig fragment No. 32


601478819F1 NIH_MGC_68 Homo sapiens cDNA clone IMAGE:3881477 5'


yd04c09.r1 Soares infant brain 1NIB Homo sapiens cDNA clone IMAGE:24675 5'


nf19a07.s1 NCI_CGAP_Pr1 Homo sapiens cDNA clone IMAGE:914196 similar to contains L1.t1 L1 repetitive element ;


EMPTY SPIRACLES HOMEOTIC PROTEIN


Homo sapiens chromosome 21 segment HS21C103


Homo sapiens chromosome 21 segment HS21C103


Arabidopsis thaliana DNA chromosome 4, contig fragment No. 50


Homo sapiens hypothetical protein FLJ10379 (FLJ10379)


Homo sapiens hypothetical protein FLJ10379 (FLJ10379)


Homo sapiens hypothetical protein FLJ10486 (FLJ10486)


Homo sapiens hypothetical protein FLJ10486 (FLJ10486)


Arabidopsis thaliana DNA chromosome 4, contig fragmen


Top Hit 


1 8 

1! 


z 


EST HUMAN | 


z 


z 




H 


z 






EST HUMAN | 


z 


z 


z 


z 


X 


z 


| 
1 






z 


z 


EST HUMAN I 


z 


EST_HUMAN


EST_HUMAN


EST_HUMAN


SWISSPROT 


z 


z 


z 


Top Hit Accession


2 


6753635] 


AA456538.1 | 


6753635 1 


AL096805.1 \ 


§ 

1 




3 


8922453J 


AL161632.2 i 


BF002932.1 


8 
R 


AF095588.1 


1 
I 


AL163278.2 


AA456538.1 


U70408.1 


AI640342.1 


Z73966.1 


§ 


10947055 


10947055 


AA456538.1 


AL161532.2


1 




AA572764.1 


P18488 


AL163303.2


AL163303.2


AL161550.2


Most Similar 
(Top) Hit 
BLAST E 
Value 


I 


2.0E-02


2.0E-02


2.0E-02


2.0E-02


2.0E-02


2.0E-02


2.0E-02


2.0E-02


9 


3 


2.0E-02I 


2.0E-02! 




2.0E-02 


2.0E-02 


2.0E-02 


2.0E-02 




8 


2.0E-02 


2.0E-02 


2.0E-02 


2.0E-02 


2.0E-02 


9E-02 


9E-02 


9 


9 


9E-02 












§ 

I 


1 
$ 




s 


























8 




8 






= 


















5 




ORF SEQ l 


6 

Q 




20077 


































1 

a 




1 


24730 










1 


21779 


21780 




SEQ ID 
NO: 




10256 




10987 






11727 




12630 


1 


13031 




13834 


1 


! 


17529 










18739 


I 




19536 




1 


11504 


11887 


11887 


12335 


1 Probe 1 


NO: 


8 


Si 










1 




1 




3105 


1 




5079 




7679 




8037 




1 


1 


9019 ! 


1 


9928 


1 




1 


1 


1 


I 2458 




























Top Hit Descriptor 




ORF SEQ 
ID NO: 








































































Probe 
SEQ ID 
NO: 
















Top Hit Descriptor 


Escherichia coli microcin 24 region, DNA binding protein (mdbA), immunity protein (mtfI), microcin 24 (mtfS), and microcin transport protein (mtfA, mtfB) genes, complete cds


RC1-HT0545-120200-011-b09 HT0545 Homo sapiens cDNA


Homo sapiens SCL gene locus


Xenopus laevis bone morphogenetic protein 4 (BMP-4) gene, complete cds


CM4-NN0119-300600-223-b05 NN0119 Homo sapiens cDNA


Mus musculus major histocompatibility complex region NG27, NG28, RPS28, NADH oxidoreductase, NG29, KIFC1, Fas-binding protein, BING1, tapasin, RalGDS-like, KE2, BING4, beta 1,3-galactosyl transferase, and RPS18 genes, complete cds; Sacm21 gene, partial*


Pyrococcus horikoshii OT3 genomic DNA, 287001-544000 nt. position (2/7)


PROBABLE PEPTIDASE Y4NA


A.californica (marine gastropod mollusc) neuropeptide gene (bag cell), exon 1, 5' end


Tursiops truncatus mRNA for p40-phox, complete cds


MR1-ST0111-111199-011-h06 ST0111 Homo sapiens cDNA


QV1-BT0677-040400-131-g03 BT0677 Homo sapiens cDNA
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S.cerevisiaa chromosome X reading frame ORF YJR1 52w j 


od80a09.s1 NCI_CGAP_Ov2 Homo sapiens cDNA clone IMAGE:1374232


od80a09.s1 NCI_CGAP_Ov2 Homo sapiens cDNA clone IMAGE:1374232


Homo sapiens melanoma-associated antigen (MAGE-C1) gene, complete cds


Oryctolagus cuniculus eIF-2a kinase mRNA, complete cds


Homo sapiens ABCG1 gene for ABC transporter (ATP-binding cassette, sub-family G (WHITE), member 1), complete cds


Cryptosporidium parvum HC-10 gene, complete cds


Cryptosporidium parvum HC-10 gene, complete cds


Glycine max glutathione S-transferase GST 21 mRNA, partial cds


AV731712 HTF Homo sapiens cDNA clone HTFAZF10 5'


FORKHEAD BOX PROTEIN D3 (HNF3/FH TRANSCRIPTION FACTOR GENESIS) (HEPATOCYTE NUCLEAR FACTOR 3 FORKHEAD HOMOLOG 2) (HFH-2)


ab79b09.s1 Stratagene fetal retina 937202 Homo sapiens cDNA clone IMAGE:853145 3'


xv21b02.x1 Soares_NFL_T_GBC_S1 Homo sapiens cDNA clone IMAGE:2813739 3'


HISTIDINE-RICH GLYCOPROTEIN PRECURSOR


UI-H-BI3-akb-c-10-0-UI.s1 NCI_CGAP_Sub5 Homo sapiens cDNA clone IMAGE:2733691 3'
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Top Hit Descriptor 


Rattus norvegicus neuronal nicotinic acetylcholine receptor subunit (Alpha10) mRNA, complete cds


xs34f09.x1 NCI_CGAP_Ut1 Homo sapiens cDNA clone IMAGE:2609033 3' similar to TR:Q12987 Q12987 ACIDIC 82 KDA PROTEIN. ;


hh39a05.y1 NCI_CGAP_GU1 Homo sapiens cDNA clone IMAGE:2969936 5'


Homo sapiens chromosome 21 segment HS21C078


ho39h08.x1 Soares_NFL_T_GBC_S1 Homo sapiens cDNA clone IMAGE:3039807 3' similar to TR:O93434 O93434 RETICULOCALBIN. ;


ho39h08.x1 Soares_NFL_T_GBC_S1 Homo sapiens cDNA clone IMAGE:3039807 3' similar to TR:O93434 O93434 RETICULOCALBIN. ;


RC1-CT0286-050400-018-C08 CT0286 Homo sapiens cDNA


zd33f10.r1 Soares_fetal_heart_NbHH19W Homo sapiens cDNA clone IMAGE:342475 5'


EST30674 Colon I Homo sapiens cDNA 5' end


7g34b10.x1 NCI_CGAP_Brn23 Homo sapiens cDNA clone IMAGE:3308347 3' similar to TR:Q13387 Q13387 HYPOTHETICAL PROTEIN 384D8_2. ;contains TAR1.t2 TAR1 TAR1 repetitive element ;


CM2-CT0478-230800-347-b11 CT0478 Homo sapiens cDNA


S.cerevisiae chromosome II reading frame ORF YBL077w


S.cerevisiae chromosome II reading frame ORF YBL077w


RC5-HT0582-160300-011-D02 HT0582 Homo sapiens cDNA


BETA-GALACTOSIDASE PRECURSOR (LACTASE)


BETA-GALACTOSIDASE PRECURSOR (LACTASE)


AV687379 GKC Homo sapiens cDNA clone GKCAFC07 5'


Bos taurus mRNA for NDP52, complete cds


yv15h01.s1 Soares fetal liver spleen 1NFLS Homo sapiens cDNA clone IMAGE:242833 3' similar to contains Alu repetitive element;


601145154F2 NIH_MGC_19 Homo sapiens cDNA clone IMAGE:3160476 5'


RC0-SN0052-110400-C21-a04 SN0052 Homo sapiens cDNA


hd22a05.x1 Soares_NFL_T_GBC_S1 Homo sapiens cDNA clone IMAGE:2910224 3' similar to SW:PXR_HUMAN O75469 ORPHAN NUCLEAR RECEPTOR PXR ;


hd22a05.x1 Soares_NFL_T_GBC_S1 Homo sapiens cDNA clone IMAGE:2910224 3' similar to SW:PXR_HUMAN O75469 ORPHAN NUCLEAR RECEPTOR PXR ;


Danio rerio odorant receptor gene cluster


ah78e11.s1 Soares_testis_NHT Homo sapiens cDNA clone 1321772 3'


ah78e11.s1 Soares_testis_NHT Homo sapiens cDNA clone 1321772 3'


Top Hit 
Database 


NT


EST_HUMAN


EST_HUMAN


NT


EST_HUMAN




EST_HUMAN


EST_HUMAN


< 


:EST HUMAN 


1 
2 
Z> 


z 


NT | 


EST HUMAN j 


SWISSPROT | 


SWISSPROT | 


z 

X 


NT


EST_HUMAN


EST_HUMAN


EST_HUMAN


EST_HUMAN


EST_HUMAN


NT


EST_HUMAN


EST_HUMAN


Top Hit Accession
No. 


3 
1 


| 


1 
1 
1 


AL163278.2


BE044191.1 


BE044191.1 


1 
< 


W68251.1


1 
1 


BE857385.1 


BE928133.1


1 


Z35838.1


BE175667.1




1 

£ 


S 


1 
< 


1 


1 


1 
< 


< 


AW511148.1 


AF112374.1


% 


AA759135.1


Most Similar 
(Top) Hit 
BLAST E 
Value 


7.0E-03


7.0E-03


s


7.0E-03


7.0E-03


7.0E-03


7.0E-03


7.0E-03


7.0E-03


7.0E-03


7.0E-03


9


7.0E-03


7.0E-03


s


7.0E-03


7.0E-03


9


7.0E-03


9


7.0E-03


9


6.0E-03


6.0E-03


9


6.0E-03


Expression 






















| 2.39| 




s 


fi 














| 1.83j 




8 








ORF SEQ
ID NO: 


| 23436 










s 
% 




1 


25967| 


8 

§ 






& 


26792| 




27671 | 














§ 


! 22499[ 




SI 


SEQ ID
NO: 


| 13653 


| 14329 






| 1491o| 


1 






I 


1 


15992| 




16302| 


1 


1 7456| 


1 


s 


1 


1975t)| 


a 




11129I 


R 




1 


1 


Probe 
SEQ ID
NO: 


| 3741 


| 4434 


s 

5 


1 


1 






| 5813[ 




I 


| 6210 | 


I 


| 6441 [ 


| 6723 | 


1 


| 7605| 


1 


| 8201 | 




1 96151 


1 


| 1220 1 


| 1220I 


| 2744| 


1 


1 






Top Hit Descriptor 


yr77h04.r1 Soares fetal liver spleen 1NFLS Homo sapiens cDNA clone IMAGE:211351 5'


Notoncus sp. cytochrome c oxidase subunit II gene, partial cds; mitochondrial gene for mitochondrial product


Fugu rubripes zinc finger protein, isotocin, fatty acid binding protein, sepiapterin reductase and vasotocin genes, complete cds


Fugu rubripes zinc finger protein, isotocin, fatty acid binding protein, sepiapterin reductase and vasotocin genes, complete cds


zc13a11.r1 Soares_parathyroid_tumor_NbHPA Homo sapiens cDNA clone IMAGE:322172 5'


UI-H-BI4-apm-c-06-0-UI.s1 NCI_CGAP_Sub8 Homo sapiens cDNA clone IMAGE:3087754 3'


Mus musculus glucosamine-6-phosphate deaminase (Gnpi), mRNA


RC0-CT0204-240999-021 -M 0 CT0204 Homo sapiens cDNA 
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yy62M0.s1 Soares mu tiple_sclerosis 2NbHMSP Homo sapiens cDNA clone IMAGE:2781 79 3' 


ov33c11.x1 Soares_testis_NHT Homo sapiens cDNA clone IMAGE:1639124 3'


EST27113 Cerebellum II Homo sapiens cDNA 5' end similar to EST containing Alu repeat
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Hcmo sapiens telomerase reverse transcriptase (TERT) gene, exons 7-16 and complete cds I 


ow13a04.x1 Soares_parathyroid_tumor_NbHPA Homo sapiens cDNA clone IMAGE:1646670 3' similar to contains MER10.b1 MER10 repetitive element ;


RC0-UM0051-210300-032-g02 UM0051 Homo sapiens cDNA
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M.thermoformicicum complete plasmid pFV1 DNA | 


EST374237 MAGE resequences, MAGG Homo sapiens cDNA


Homo sapiens hypothetical zinc finger protein FLJ14011 (FLJ14011), mRNA


te91c12.x1 NCI_CGAP_Pr28 Homo sapiens cDNA clone IMAGE:2094070 3' similar to TR:O00519 O00519 FATTY ACID AMIDE HYDROLASE. ;


te91c12.x1 NCI_CGAP_Pr28 Homo sapiens cDNA clone IMAGE:2094070 3' similar to TR:O00519 O00519 FATTY ACID AMIDE HYDROLASE. ;


Mus musculus zinc-finger protein mRNA, complete cds


601572746F1 NIH_MGC_57 Homo sapiens cDNA clone IMAGE:3839747 5'
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Top Hit Descriptor 


Homo sapiens KIAA0345 gene product (KIAA0345), mRNA


te49b11.x1 Soares_NFL_T_GBC_S1 Homo sapiens cDNA clone IMAGE:2090013 3' similar to contains Alu repetitive element;


Homo sapiens chromosome 21 segment HS21C009


Homo sapiens chromosome 21 segment HS21C078


Homo sapiens chromosome 21 segment HS21C006


PM4-BN0138-180600-002-b08 BN0138 Homo sapiens cDNA


6011181S4F1 NIH_MGC_17 Homo sapiens cDNA clone IMAGE:3028D95 5'


7q74c09.x1 NCI_CGAP_Lu24 Homo sapiens cDNA clone IMAGE: 3' similar to contains Alu repetitive element;contains element MER31 repetitive element ;


hh02c07.x1 NCI_CGAP_Kid11 Homo sapiens cDNA clone IMAGE:2953932 3' similar to contains element LTR5 repetitive element ;


RC3-ST0281-240400-015-f03 ST0281 Homo sapiens cDNA


Homo sapiens Grb2-associated binder 2 (KIAA0571), mRNA


Homo sapiens protein kinase CK2 catalytic subunit alpha gene, exon 1


Homo sapiens protein kinase CK2 catalytic subunit alpha gene, exon 1


nc73c05.s1 NCI_CGAP_Pr2 Homo sapiens cDNA clone IMAGE:7B2984 similar to contains Alu repetitive


Homo sapiens MHC class I region


S.cereale (cv. Halo) mRNA for triosephosphate isomerase


Mus musculus intestinal trefoil factor gene, partial cds


Mus musculus intestinal trefoil factor gene, partial cds


Arabidopsis thaliana rpoMt gene
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.Mus musculus alpha-l(XVIII) collagen (COL18A1) gene, exon 1 and 2 j 


C.elegans samdc gene


AV762392 MDS Homo sapiens cDNA clone MDSBSG01 5'


AV762392 MDS Homo sapiens cDNA clone MDSBSG01 5'


ah04f09.y5 Gessler Wilms tumor Homo sapiens cDNA clone IMAGE:1155689 5'


S.cereale (cv. Halo) mRNA for triosephosphate isomerase


Rattus norvegicus gdnf gene


xu8.P10.H3 conorm Homo sapiens cDNA 3'


DKFZp761B0712_r1 761 (synonym: hamy2) Homo sapiens cDNA clone DKFZp761B0712 5'
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3.0E-03| 


s 


3.0E-03| 


3.0E-03| 


9 


3.0E-03] 


3.0E-03| 


3.0E-03| 


leo-aoe 


3.0E-03


3.0E-03


3.0E-03


3.0E-03


3.0E-03


3.0E-03


Expression 
Signal 


1 2.02, 








s 


| 3.95[ 


s 




1.57| 




I 1-521 


















4.53| 








7.43J 


1 7.43! 








4.37| 


| 0.88| 


ORF SEQ
ID NO:


1 


R 




j 27229 1 


§ 












1 






21401 | 














8 






| 23591 | 


1 


8 




| 23978j 


I 


| 24309| 


SEQ ID
NO: 


8 
1 


1 


1 7028 


I7036| 


iS345j 


1 9725 


1 






1 




1 0323 


1 0787 


11542 


1 2097 1 


12132 


12133 






1 


I3037 


1 


1 


i 




1 


1 


1 


14311 


14518 




























































Probe 
SEQ ID
NO: 


6774 


8 


7151 


8 


1 


1 


B 


I 


1 


1 

8 




8 


1 


| 163s| 


| 2211| 


| 2248| 


1 


| 2249| 


1 


| 3044| 


| 3112| 


1 3371| 






1 


1 


1 


1 




1 
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4 
g 

1 

| 


ab18a08.x5 Stratagene lung (#937210) Homo sapiens cDNA clone IMAGE:841142 3' similar to contains Alu repetitive element;


601482715F1 NIH_MGC_68 Homo sapiens cDNA clone IMAGE:3885483 5'


Homo sapiens RAP1, GTPase activating protein 1 (RAP1GA1) mRNA


Homo sapiens RAP1, GTPase activating protein 1 (RAP1GA1) mRNA


Homo sapiens hypothetical protein FLJ10539 (FLJ10539), mRNA


Mus musculus mRNA for hypothetical protein (ORF2 ortholog)


aa13f10.r1 Soares_NhHMPu_S1 Homo sapiens cDNA clone IMAGE:813163 5'


Kluyveromyces marxianus pcpl3 gene for purine-cytosine permease


Oryza sativa gene for bZIP protein, complete cds




HETEROGENEOUS NUCLEAR RIBONUCLEOPROTEIN A2 HOMOLOG 1 (HNRNP A2(A))


Homo sapiens chromosome 21 segment HS21C068


NONSTRUCTURAL PROTEIN V


hh80f10.x1 NCI_CGAP_GU1 Homo sapiens cDNA clone IMAGE:2969131 3' similar to contains L1.t1 L1 repetitive element ;


Arabidopsis thaliana DNA chromosome 4, contig fragment No. 85


ov03d12.x1 NCI_CGAP_Kid3 Homo sapiens cDNA clone IMAGE:1636247 3' similar to gb:X57138_rna1 HISTONE H2B.2 (HUMAN);


CIRCUMSPOROZOITE PROTEIN PRECURSOR (CS)


RETROVIRUS-RELATED POL POLYPROTEIN [CONTAINS: REVERSE TRANSCRIPTASE ; ENDONUCLEASE]


HETEROGENEOUS NUCLEAR RIBONUCLEOPROTEIN A2 HOMOLOG 1 (HNRNP A2(A))


Homo sapiens chromosome 21 segment HS21C103


Homo sapiens ATP/GTP-binding protein (HEAB), mRNA


Pneumocystis carinii kexin-like serine endoprotease mRNA, partial cds


Homo sapiens golgin-like protein (GLP) gene, complete cds


Homo sapiens trinucleotide repeat DNA binding protein p20-CGGBP (CGGBP) gene, complete cds


Homo sapiens trinucleotide repeat DNA binding protein p20-CGGBP (CGGBP) gene, complete cds


RETROVIRUS-RELATED POL POLYPROTEIN [CONTAINS: REVERSE TRANSCRIPTASE ; ENDONUCLEASE]


promrna-5.E07.rbvtumor Homo sapiens cDNA 5'


ot77b10.s1 Soares_total_fetus_Nb2HF8_9w Homo sapiens cDNA clone IMAGE:1622779 3' similar to contains L1.t3 MER26 repetitive element ;


Hi 


IUMAN , 


z 










s 

n 






z 

1 


O 




5PROT 


1 




| 


5PROT I 


1 


5PROT 














SPROT 


IUMAN j 


1 
=> 




EST ^ 


EST h 




Z 


Z 




EST h 






EST 1- 


1 




SWISS 








SWISS 


1 


% 






z 


z 


z 




3SIMS 


EST h 


EST 1- 


Top Hit Accession
No. 


I 

< 


BE787945.1


4506414


4506414


•" 


AJ249981.1


AA456701.1


AJ011419.1


AB021736.1


N92580.1


l 


AL163268.2


Q9QM81




AL161589.2 j 


< 


P08672 I 


g 


1 


|AL163303.2 j 


1 
1 


AF009222.1 


AF266285.1 


AF094481.1 




§ 


AI525056.1


1 


Most Similar 
(Top) Hit 
BLAST E 
Value 


s 


i 3.0E-03| 


| 3.0E-03| 


s 


| 3.0E-03[ 


s 


3.0E-03| 


3.0E-03| 


3 


! 3.0E-03! 


3.0E-03| 


3.0E-03| 


3.0E-03| 


s 


3 


3.OE-O3I 


3.0E-03| 


s 


s 


3.0E-03| 


3.0E-03| 


3.0E-03| 


3.0E-03| 


3.0E-03| 


3.0E-03| 


s 


3.0E-03| 


3.OE-O3I 


.§_ 

2 .§> 

r 




| 7,09| 


















1.29| 












! 3.48| 


a 




4.05| 


1 




2.01 [ 




I 3.06| 


% 


2.02 1 




ORF SEQ
ID NO: 


1 


| 24416| 


| 24733 1 


1 


| 24837| 


8 


I 25974I 


| 26297| 


1 
s 




| 27001 | 


| 27011 1 






j 27297| 










| 27931 1 




1 


| 28137| 




1 


% 




1 


log 


14613 


1 


! 


1 


9 


1 5330 1 


1-985 1 


16141 


1 






8 


1 


1 








I 


17592 


I 


18104 




I 


0598 1 






1 9542 






























































Probe 
SEQ ID 
NO: 


1 


| 4745| 


S 


8 
S 


| 5220 1 


| 5410 | 


! 


6277 


i 




1 




I 


1 


I 7230 I 






I 


1 7742| 




| 8222| 


! 


| 8744| 


| 8773 1 


| 8773| 


1 


8 
8 


8 
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133 of 41 3 
Table 4

Single Exon Probes Expressed in Heart


Top Hit Descriptor 

Homo sapiens gene for CMP-N-acetylneuraminic acid hydroxylase, partial cds

Rattus norvegicus mRNA for connexin36 (cx36 gene)

RING CANAL PROTEIN (KELCH PROTEIN)

RING CANAL PROTEIN (KELCH PROTEIN)




Homo sapiens chromosome 21 segment HS21C102

UI-H-BI1-adi-g-10-0-UI.s1 NCI_CGAP_Sub3 Homo sapiens cDNA clone IMAGE:2717010 3'

zxwiu.n faoaresjotal fetus Nb2HF8 9w Homo sapiens cDNA clone IMAGE7891 1 4 V 

602183960T1 NIH_MGC_42 Homo sapiens cDNA clone IMAGE:4300070 3'

H.sapiens DMA, DMB, HLA-Z1, IPP2, LMP2, TAP1, LMP7, TAP2, DOB, DQB2 and RING8, 9, 13 and 14

Rattus norvegicus mRNA for SREB1, complete cds

ENV POLYPROTEIN [CONTAINS: COAT PROTEIN GP52; COAT PROTEIN GP36]


Drosophila melanogaster shortsighted class 2 (shs) mRNA, complete cds

Drosophila melanogaster shortsighted class 2 (shs) mRNA, complete cds

Homo sapiens calcium channel alpha1E subunit (CACNA1E) gene, exons 7-49, and partial cds, alternatively

ra45e02.s1 Soares adult brain N2b4HB55Y Homo sapiens cDNA clone IMAGE:180890 3'

Camelus dromedarius cvhp19 gene for immunoglobulin heavy chain variable region

Homo sapiens extracellular glycoprotein lacritin precursor, gene, complete cds

Homo sapiens extracellular glycoprotein lacritin precursor, gene, complete cds




Top Hit 
Database 

NT 

SWISSPROT 
SWISSPROT 


1 1 EI 

i, | | i 


NT 

EST_HUMAN
EST_HUMAN
EST_HUMAN

NT 

SWISSPROT 


§ 

X 

. z z z m z z z 


CO 


Top Hit Accession
No. 

AB009S68.1 

AJ206282.1 

Q04652 

Q04652 

T70874.1 


M20783.1 

AA661605.1 

AF284446.1 

P48509 

4557836 
4557836 

P29400 

AA450138.1 

AF302691.1 


AL163302.2 
AW137782.1 
AA450138.1 
BF568955.1 

X87344.1 

AB040802.1 

P03374


5 5 | ^ i II 




Most Similar 
(Top) Hit 
BLAST E 
Value 

3.0E-03 
3.0E-03 
2.0E-03 
2.0E-03 
2.0E-03 


2.0E-03 
2.0E-03 
2.0E-03 

2.0E-03 

2.0E-03 

2.0E-03 
2.0E-03 
2.0E-03 
2.0E-03


2.0E-03 
2.0E-03 
2.0E-03 
2.0E-03 

2.0E-03 
2.0E-03 
2.0E-03 


2.0E-03
2.0E-03

2.0E-03
2.0E-03
2.0E-03
2.0E-03
2.0E-03




Expression 
Signal 

1.3E 
2.33 
0.8S 
0.89 
7.01 


1.59 
12.6 

1.91 

1.91 
4.94 
1.38 
2.17 


1.04 
4.62 
4.03 
1.02 

4.87 
0.92 
2.15 
9.77 


q | | S | q ° 




So Hi 

" Z "° ° 


1104 

1115 

1243 

1263 

264 1 

1502 
1731 


"088 
093 

315 
572 
703 


130 

285 

820 ' 
821 1 




O 




<N S <N ci oj Si 






SEQ ID 
NO: 

1966E 

10446 
12677 


11246 
11249 

11379 

11404 

11404 
11477 
11634 
11845 


12409 
13289 
13294 

13529 
13780 
13927 
14027 


14340 
14340 

1449S 
14500 
14947 
15057 
15057 




Probe 
SEQ ID 

NO: 

9159 

504 
504 
768 


1340 
1343 
1353 

1474 

1500 

1500 
1573 
1733 
1950 


2535 
3370 
3376 

3615 

4024 
4127 


4446 
4446 

4612 
5077 
5194 
5194 
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Top Hit Descriptor 


601876385F1 NIH_MGC_55 Homo sapiens cDNA clone IMAGE:4104692 5'


Homo sapiens mRNA for KIAA0693 protein, partial cds


Xenopus laevis xefiltin mRNA, complete cds


ATP-DEPENDENT NUCLEASE SUBUNIT B


ATP-DEPENDENT NUCLEASE SUBUNIT B


CARBONIC ANHYDRASE-RELATED PROTEIN 2 PRECURSOR (CARP 2) (CA-RP II) (CA-XI)


CARBONIC ANHYDRASE-RELATED PROTEIN 2 PRECURSOR (CARP 2) (CA-RP II) (CA-XI)


601887434F1 NIH_MGC_17 Homo sapiens cDNA clone IMAGE:4121408 5'


ADAM-TS 7 PRECURSOR (A DISINTEGRIN AND METALLOPROTEINASE WITH THROMBOSPONDIN MOTIFS 7) (ADAMTS-7) (ADAM-TS7)


L.esculentum mRNA for lysyl-tRNA synthetase (LysRS)


wu36h09.x1 Soares_Dieckgraefe_colon_NHCD Homo sapiens cDNA clone IMAGE:2522177 3' similar to SW:RL29_HUMAN P47914 60S RIBOSOMAL PROTEIN L29 ;contains element MSR1 repetitive element ;


CM4-BT0366-031299-054-d01 BT0366 Homo sapiens cDNA


hf37b06.x1 Soares_NFL_T_GBC_S1 Homo sapiens cDNA clone IMAGE:2934035 3' similar to TR:Q60976 Q60976 JERKY. ;


yx42g06.s1 Soares melanocyte 2NbHM Homo sapiens cDNA clone IMAGE:264442 3' similar to contains L1.b2 L1 repetitive element ;


yx42g06.s1 Soares melanocyte 2NbHM Homo sapiens cDNA clone IMAGE:264442 3' similar to contains L1.b2 L1 repetitive element ;


TENASCIN PRECURSOR (TN) (HEXABRACHION) (CYTOTACTIN) (NEURONECTIN) (GMEM) (JI) (MIOTENDINOUS ANTIGEN) (GLIOMA-ASSOCIATED-EXTRACELLULAR MATRIX ANTIGEN) (GP 150-225) (TENASCIN-C) (TN-C)


zs10a06.s1 NCI_CGAP_GCB1 Homo sapiens cDNA clone IMAGE:684754 3'


Human dystrophin gene


PROTEOGLYCAN LINK PROTEIN PRECURSOR (CARTILAGE LINK PROTEIN) (LP)


RC3-BT0333-310800-115-g04 BT0333 Homo sapiens cDNA


H.sapiens variable number tandem repeat (VNTR) locus DNA


ty65h03.x1 NCI_CGAP_Kid11 Homo sapiens cDNA clone IMAGE:2283989 3' similar to SW:VATG_MANSE Q25532 VACUOLAR ATP SYNTHASE SUBUNIT G ;


Homo sapiens SEL1L (SEL1L) gene, partial cds


Camelus dromedarius cvhp19 gene for immunoglobulin heavy chain variable region


AV697966 GKC Homo sapiens cDNA clone GKCGXD05 5'


imp 

tU 








O 


PROT 
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1 


I 


5PROT 




lUMAN 


UMAN 


lUMAN 


1 


lUMAN | 


5PROT 


z 

2 
D 




PROT ! 


UMAN 




lUMAN 






UMAN j 




I 


2 




% 


% 


SWISS 
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X l 




Z 






!EST h 
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§ 
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& 
% 


ESTJ- 




;EST h 


z 




ESTJ- 


Top Hit Accession
No. 


BF241410.1


AB014593.1


U63711.1


P23477


P23477


Q95203


Q95203


BF308187.1


Q9UKP4




1 


1 

S 


AW692004.1 | 


z 


N20287.1


P24821


AA251376.1


M86524.1


P07354


BF330909.1


i 


AI625745.1 \ 


< 


AJ245167.1 i 


i 

?! 

1 




S 




3 


s 




3 




s 






9 


3 




3 
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3 


3 


S 




3 


3 
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o -—en 






























































¥j 














8j 
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% 


jjj 


1 = 

i 3 




















































ORF SEQ
ID NO: 


25128 






25725| 


25726 


25846 


25847| 


25849 


| 26874 






S 
8 


! 26763 1 


1 

s 


26854| 


27508 






26520 1 




1 










SEQ ID 

NO: 


1 


| 




15624| 


15624j 


1 






| 15756 


15767| 




1 


§ 




1 


s 


§ 




s 
s 


I 








14947 | 


s 


SEQ ID 

NO: 


5372 






571 6 [ 


571 6 [ 


I 


5829 j 




| 5850 [ 


1 


I 




I 


1 6784I 


| 6784) 


7513 


7775| 


8379 [ 


8779 1 


s 


1 






9113| 


9322 1 
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Top Hit Descriptor 

H.sapiens M1 gene for muscarinic acetylcholine receptor


Homo sapiens chromosome 21 segment HS21C003

:a65f02.x1 Soares_total_fetus_Nb2HF8_9w Homo sapiens cDNA clone IMAGE:204S051 3' similar to contains Alu repetitive element;


Homo sapiens MSH55 gene, partial cds; and CLIC1, DDAH, G6b, G6c, G5b, G6d, G6e, G6f, BAT5, G5b, CSK2B, BAT4, G4, Apo M, BAT3, BAT2, AIF-1, 1C7, LST-1, LTB, TNF, and LTA genes, complete cds
AV697966 GKC Homo sapiens cDNA clone GKCGXD05 5'


yt98c08.r1 Soares_pineal_gland_N3HPG Homo sapiens cDNA clone IMAGE:232334 5'

as70b08.x1 Barstead colon HPLRB7 Homo sapiens cDNA clone IMAGE:2334039 3' similar to TR:Q13825 Q13825 AU-BINDING PROTEIN/ENOYL-COA HYDRATASE


as70b08.x1 Barstead colon HPLRB7 Homo sapiens cDNA clone IMAGE:2334039 3' similar to TR:Q13825 Q13825 AU-BINDING PROTEIN/ENOYL-COA HYDRATASE

wk86a06.x1 NCI_CGAP_Pan1 Homo sapiens cDNA clone IMAGE:2422258 3'


wx93e10.x1 NCI_CGAP_Mel15 Homo sapiens cDNA clone IMAGE:2551242 3'

wd86a01.x1 NCI_CGAP_Lu24 Homo sapiens cDNA clone IMAGE:2338440 3' similar to contains Alu repetitive element;

HIGH MOLECULAR WEIGHT FORM OF MYOSIN I (HMWMI)


Homo sapiens mRNA for KIAA1291 protein, partial cds

CARBONIC ANHYDRASE VI PRECURSOR (CARBONATE DEHYDRATASE VI) (CA-VI) (SECRETED CARBONIC ANHYDRASE) (SALIVARY CARBONIC ANHYDRASE)


CARBONIC ANHYDRASE VI PRECURSOR (CARBONATE DEHYDRATASE VI) (CA-VI) (SECRETED CARBONIC ANHYDRASE) (SALIVARY CARBONIC ANHYDRASE)

LINE-1 REVERSE TRANSCRIPTASE HOMOLOG

Human MUC2 gene, promoter region

Human MUC2 gene, promoter region


Homo sapiens SVMT gene for synaptic vesicle monoamine transporter, exons 14, 15

S.cerevisiae chromosome X reading frame ORF YJR149w

=SC1-TN0128-160800-021-g01 TN0128 Homo sapiens cDNA

TCBAP1D4909 Pediatric pre-B cell acute lymphoblastic leukemia Baylor-HGSC project=TCBA Homo sapiens cDNA clone TCBAP4909

Caenorhabditis elegans spliced leader RNA (SL3 alpha), (SL4), and (SL5) genes


Top Hit 
Database 
Source 

NT 


1 
I 


NT 

EST HUMAN 


jj 


EST_HUMAN
EST_HUMAN


EST_HUMAN

EST_HUMAN
SWISSPROT


NT 

SWISSPROT 


SWISSPROT 

SWISSPROT 

NT 

NT 


z z 
i I 

X I 
Z Z UJ UJ Z 


Top Hit Accession
No.

508.1 


33203.2 
5037.1 


29756.1 
97966.1 


471.1 
0263. ,1 


AI720263.1 
AI865788.1 


4572.1 

2616.1 
08 








? is 


< < 


E § 
< < 


s s 

r < 


s i § s 

< < 2. < 


1 § 




lis Si 


Most Similar 
(Top) Hit 
BLAST E 
Value 

2.0E-03 


2.0E-03 
2.0E-03 


2.0E-03 
2.0E-03 


1.0E-03
1.0E-03


1.0E-03
1.0E-03


1.0E-03

1.0E-03
1.0E-03


2 2 


1.0E-03
1.0E-03
1.0E-03
1.0E-03


1.0E-03
1.0E-03
1.0E-03

1.0E-03


Expression 

1.59 
9.62 




En 8 


2.38! 




3.61 




2.26 
1.16 
0.79 
0.79 


1.02 

3.54 

3.77 
0.87 


ORFSEQ 
ID NO: 

25278 






3 fe 


20588 
20836 




g R 


llll 


23559 
24013 

24054 
24244 


SEQ ID 

NO: 

19066 
1S087 


I 1 


19247 


10740 


10740 
10995 


11063 
11997 


12878 
13079 


13079 
13185 

13413 


13766 
14231 

14273 
14456 


Probe
SEQ ID NO:

9414 
9455 


i I 


1 1 I 




812 
1079 


1150 
1982 
2108 


i ! 


"""" 


3855 
4334 

4377 
4564 
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1 

] 

c 

! 


• 


ov45c04.x1 Soares_testis_NHT Homo sapiens cDNA clone IMAGE:1640262 3'

ov45c04.x1 Soares_testis_NHT Homo sapiens cDNA clone IMAGE:1640262 3'

PM0-HT0339-200400-010-D02 HT0339 Homo sapiens cDNA

APOLIPOPROTEIN A-IV PRECURSOR (APO-AIV)

AV685870 GKC Homo sapiens cDNA clone GKCDME11 5'

zs44f01.r1 NCI_CGAP_GCB1 Homo sapiens cDNA clone IMAGE:700345 5'

Homo sapiens KVLQT1 gene

Epstein-Barr virus (AG876 isolate) U2-IR2 domain encoding nuclear protein EBNA2, complete cds

Epstein-Barr virus (AG876 isolate) U2-IR2 domain encoding nuclear protein EBNA2, complete cds

COLLAGEN ALPHA 1(VII) CHAIN PRECURSOR (LONG-CHAIN COLLAGEN) (LC COLLAGEN)


1 
1 

1 
2 


Homo sapiens T-cell lymphoma invasion and metastasis 1 (TIAM1), mRNA

yd93a11.r1 Soares fetal liver spleen 1NFLS Homo sapiens cDNA clone IMAGE:115772 5'

QV3-NN1024-260400-171-g05 NN1024 Homo sapiens cDNA

Human gene for fourth somatostatin receptor subtype

Homo sapiens 959 kb contig between AML1 and CBR1 on chromosome 21q22, segment 2/3

Homo sapiens X28 region near ALD locus containing dual specificity phosphatase 9 (DUSP9), ribosomal protein L18a (RPL18a), Ca2+/Calmodulin-dependent protein kinase I (CAMKI), creatine transporter (CRTR), CDM protein (CDM), adrenoleukodystrophy protein >

Human TRPM-2 protein gene, exons 1,2 and 3

Homo sapiens partial steerin-1 gene

Homo sapiens exostoses-like protein 1 (EXTL1) gene, exons 2 through 11, and complete cds

V.carteri gene encoding volvoxopsin

Human class III alcohol dehydrogenase (ADH5) chi subunit mRNA, complete cds

Human class III alcohol dehydrogenase (ADH5) chi subunit mRNA, complete cds

Thermotoga neapolitana alpha-1,6-galactosidase (aglA) gene, complete cds

Thermotoga neapolitana alpha-1,6-galactosidase (aglA) gene, complete cds

RC1-CT0279-181099-011-a09 CT0279 Homo sapiens cDNA

RC1-CT0279-181099-011-a09 CT0279 Homo sapiens cDNA

QV3-HT0543-220300-130-a03 HT0543 Homo sapiens cDNA

tt73e12.x1 NCI_CGAP_HSC3 Homo sapiens cDNA clone IMAGE:2246446 3' similar to TR:Q26195 Q26195 PVA1 GENE. ;

AV759949 MDS Homo sapiens cDNA clone MDSDDF11 5'

601433087F1 NIH_MGC_72 Homo sapiens cDNA clone IMAGE:3918524 5'


Top Hit 


8 

I 


EST_HUMAN

EST_HUMAN

EST_HUMAN

SWISSPROT

EST_HUMAN

EST_HUMAN

NT

NT

NT


O 
X. 

I 


INT | 


!nt I 


z 

1 

D 
I 


X 


NT | 


nt I 


NT 


:nt I 


z 


:nt I 


NT 


,NT I 


NT I 


|NT | 


z 


1 

3 

I 


EST_HUMAN { 


EST HUMAN | 


EST HUMAN | 


!EST HUMAN j 


;est_human j 


Top Hit Accession
No.


AI073485.1


AI073485.1


1 


1 

o 


;AV685870.1 


IAA290951.1 i 


IAJ006345.1 ! 


| 


IK03332.1 | 


SQ02388 | 


|X07699.1 | 


11 5261 76 | 


i 


AW902585.1 I 


Q 


(AJ229042.1 J 


U52111.2 


jM63376.1 


jAJ251973.1 


;AF153980.1 | 


SY11204.1 '| 


|M30471.1 


i 


jAF011400.1 


< 


]AW362393.1 j 


% 
< 


|BE1 70859.1 


IaI583847.1 


|AV759949.1 


I 

! 


Most Similar 
(Top) Hit 


BLAST E 
Value 




.OE-03 1 


.OE-03 1 


.OE-03 1 


.OE-03; 




.OE-03 1 


.OE-03 1 


? 


3 


.0E-D3I 


.OE-03 1 


3 


.OE-03I 


.0E-O3| 


.OE-03 1 


.OE-03 


.OE-03 1 


.OE-03! 


.OE-03! 


.OE-03! 


.OE-03I 


! £0-30' 


-OE-03 


-OE-03: 


! eo-30' 


.OE-03 1 


3 


.OE-03 


SO-30- 


.OE-03 
































































Expression 
Signal 












2.03 i 




1 2.01| 








s 














i 5.35! 






] 4.27| 












I 2.44 






! 2.S2 


ORFSEQ 


ID NO: 




I 




1 




I 


25042| 


l 


25097I 


1 






25842| 




S 




S3 




26707| 


I 






% 


S 


i 27603 1 




j 28197J 








| 291 13| 


SEQ ID 

NO: 


1 




I 




1 




1 


1 


1 


I 


1 5577 


1 


15730 


1 5763 


8 


1 


I 




i 




1 


8 


8 


1 


1 






1 


§ 


18370 


































































Probe
SEQ ID
NO:


4717 


I 4717| 




| 5009[ 


| 5103| 


1 


| 5317| 


1 


? 
8 


1 54231 


ES 

93 


1 


| 5824 1 


I 5857 1 


1 


1 


6503 


1 


1 


1 


1 71751 


| 7273 | 


I 7273I 


s 


| 7541 1 


1 


8 


| 8133j 


1 


I 


I 
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Table 4 

Single Exon Probes Expressed in Heart


Top Hit Descriptor 


Mus musculus transformation related protein 63 (Trp63), mRNA

te05H11.x1 NCI_CGAP_Co16 Homo sapiens cDNA clone IMAGE:2063013 3' similar to

repetitive element;

601468878F1 NIH_MGC_67 Homo sapiens cDNA clone IMAGE:3872035 5'
Human retinoblastoma susceptibility gene exons 1-27, complete cds

APOLIPOPROTEIN A-IV PRECURSOR (APO-AIV)

Glycyrrhiza glabra GgbAS1 mRNA for beta-amyrin synthase, complete cds

X.laevis mRNA for C4SR protein

yf12h10.r1 Soares fetal liver spleen 1NFLS Homo sapiens cDNA clone IMAGF-Ufificvi <=
LINE-1 REVERSE TRANSCRIPTASE HOMOLOG






Top Hit 
Database 


EST_HUMAN 
EST HUMAN 
NT 

SWISSPROT 

NT 

NT 

EST HUMAN 
SWISSPROT 




AI347355.1 

BE780572.1 

L11910.1 

P06727 

AB037203.1 

X96469.1 

R07008.1 

P08547 


AI571099.1 
L41825.1
U29185.1 
AL163210.2 

4885170 
AI769331.1 

U78027.1 
Z40561.1 
BE077941.1 
R17336.1

6005855 
41862525.1 
<01315.1 
<01315.1 

J45983.1 1 
=46408 

*L048507.2 E 
3E005850.1 E 
U229042.1 t 
WV013847.1 E 
201768 J 




1*?, S 


■OE-03 
OE-03 
.OE-04 
.OE-04 
.OE-04 
■OE-04 
.OE-04 
■OE-04 


OE-04 
OE-04 
OE-04 
OE-04 
OE-04 
OE-04 
OE-04 

OE-04 
OE-04 
OE-04 
OE-04 
OE-04 
OE-04 
3E-04 
3E-04 
DE-04 i 
3E-04I 
JE-04 / 
E-04 "i 
E-04 / 
E-04 / 
E-04 1 C 




jlP ' 








Expression 
Signal 


2.16 
4.2 
0.81 

1.42 
1.39 
1.27 
42. 
2.39 


2.08 
2.1S 
1.61 
1.13 

1.03 
2.24 

3.2 
2.61 
3.29 
2.72 
3.28 
1.34 
0.85 
0.85 
3.28 
4.35 
3.13 
2.24 
2.07 
6.32 
2.45 




ORFSEQ 
ID NO: 

25287 


24904 
24769 

23548 
24332 


21562 
22127 
22437 
22965 

28973 

23576 
23680 
23681 
23769 

27887 
28881 

28950 




SEQ ID 
NO: 

19108 


19687 

14998 
15404 
17412 
11377 
13754 
13989 
14542| 


18483 

12230 
12547 
13166 
15828 

18663 
18683 
19155 
19300 
19321 
13788 
13905 
13905 
13992 
16470 
17583 
17650 
18593 
18664 
18714| 




Probe 
SEQ ID 
NO: 

9491 


9517 
9626 
5131 
5485 
7561 
1472 
3843 
4089 
4656 


8485 
8616 
1786 
2350 
2682 
3243 
5923 

8851 
8871 
9558 
9790 
9813 
3877 
3999 
3999 
4092 
6590 
7733 
7800 
8776 
8852 
8906| 
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Table 4 







Top Hit
Database
Source

EST_HUMAN
NT
SWISSPROT
EST_HUMAN






5.0E-04
5.0E-04

5.0E-04
5.0E-04
5.0E-04
5.0E-04

5.0E-04
5.0E-04
5.0E-04
4.0E-04
4.0E-04

4.0E-04

4.0E-04
4.0E-04
4.0E-04
4.0E-04
4.0E-04
4.0E-04
4.0E-04

4.0E-04




Expression
Signal

3 1.4! 
> 5.8: 
la: 


2.1S 
5.72 
13.6 

5.57 
1.45 

4.17 

17.6 
2.27 
1.46 
1.13 
1.42 

1.57 

1.57 
3.92 
1.68 
1.19 
2.73 

0.84 

2.75, 




ORF SEQ
ID NO:

25171 
20381 


25113 
26026 

26710 
27412 
27606 

25113 

20410 

20606 

20607 
21218 
21823 

22348 
22850 
23041 

23900 




SEQ ID NO:


15281 
15901 
16227 

16519 
17213 
17394 
18220 

15281 
19530 
19413 
10333 
10592 

10756 

10756 
11354 
11928 
11976 
12457 

3052 

3237 

4126| 




Probe 
SEQ ID

NO: 

9225 
9958 
635 
1485 


5361 
5996 

6639 
7345 
7543 
8343 

8950 
9164 
9944 

386 

657 

829 
1449 
2037 
2086 
2586 
3127 
3316 ' 

4228 -1 
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Top Hit Descriptor 


nh10a10.s1 NCI_CGAP_Co1 Homo sapiens cDNA clone IMAGE:951930 3' similar to gb:M21121 T-CELL SPECIFIC RANTES PROTEIN PRECURSOR (HUMAN);

zn61c08.s1 Stratagene muscle 937209 Homo sapiens cDNA clone IMAGE:562670 3'

601345895F1 NIH_MGC_8 Homo sapiens cDNA clone IMAGE:3678910 5'

Vy78b10.s1 Soares_multiple_sclerosis_2NbHMSP Homo sapiens cDNA clone IMAGE:279643 3' similar to contains Alu repetitive element;

yx39e12.r1 Soares melanocyte 2NbHM Homo sapiens cDNA clone IMAGE:264142 5'

ov87h03.s1 Soares_testis_NHT Homo sapiens cDNA clone IMAGE:1644341 3'


1 
a 

1 

"8 
.y 

I 

1 
z 

1 

£ 
1 

1 
1 


Homo sapiens SMARCA4 isoform (SMARCA4) gene, complete cds, alternatively spliced

FORMIN (LIMB DEFORMITY PROTEIN)


Q 

S 

1 

l 
f 

| 

I 
1 
1 

i 

D 


180 KD SECRETORY PHOSPHOLIPASE A2 RECEPTOR PRECURSOR (PLA2-R)

Human short chain acyl CoA dehydrogenase gene, exons 1 and 2


8 

< 
i 

1 
1 

D 

■a 

1 
i 

< 

u 


INTERNALIN B PRECURSOR

GLUTAMATE DEHYDROGENASE 2 PRECURSOR (GDH)

Homo sapiens Xq pseudoautosomal region; segment 1/2

PM0-HT0339-190200-007-g12 HT0339 Homo sapiens cDNA

QV3-DT0045-221299-046-d09 DT0045 Homo sapiens cDNA

V.carteri gene encoding volvoxopsin

Homo sapiens chromosome 21 segment HS21C081

Homo sapiens chromosome 21 segment HS21C078

FIBROBLAST GROWTH FACTOR RECEPTOR 3 PRECURSOR (FGFR-3)

zx48d08.r1 Soares_testis_NHT Homo sapiens cDNA clone IMAGE:795471 5' similar to gb:M62762 VACUOLAR ATP SYNTHASE 16 KD PROTEOLIPID SUBUNIT (HUMAN);

aj24g05.s1 Soares_testis_NHT Homo sapiens cDNA clone 1391288 3' similar to gb:M36072 60S RIBOSOMAL PROTEIN L7A (HUMAN);

nc38e04.r1 NCI_CGAP_Pr2 Homo sapiens cDNA clone IMAGE:1010430 similar to contains L1.t2 L1 repetitive element ;

Homo sapiens mRNA for KIAA0749 protein, partial cds

DKFZp547L185_r1 547 (synonym: hfbr1) Homo sapiens cDNA clone DKFZp547L185 5'

Homo sapiens SCG10 like-protein, helicase-like protein NHL, M68, and ADP-ribosylation factor related protein 1 (ARFRP1) genes, complete cds




2 

1 


z 
5 


IUMAN j 


z 

1 


| 


| 






1 




5PROT I 




z 

I 


>PROT | 


SPROT | 




IUMAN | 










O 
a: 


z 

I 
r> 


IUMAN | 


IUMAN | 




IUMAN | 






EST h 


EST h 


EST I- 


EST h 


EST h 


EST F 




z 


SWISS 


EST F 


& 
s 


z 


EST h 


SWISS 


£9 




EST h 


EST F 








£2 
w 


EST h 


EST h 


EST h 


z 


EST h 


z 


Top Hit Accession
No.


|aA578331.1 j 


|AA086324.1 | 


I 
1 


|n48313.1 S 


|N25507.1 j 


< 


1 


§ 
< 


|Q05360 | 


|AL1 19426.1 j 


|P49259 ! 


B 


< 


[P25147 j 


|P49448 j 


|AJ271735.1 \ 


|BE1 53778.1 j 


|AW937723.1 | 


1 


|AL163281.2 \ 


| AL1 63278.2 j 


jP22607 | 


AA454055.1 | 


AA781201.1 | 


AA228301.1 | 


IAB018292.1 | 


1 
< 


AF21 7796.1 | 


lib • 




OE-04 


OE-04! 


s 


OE-04| 




s 


OE-04j 


0E-O4| 


0E-O4j 


OE-04 


S 


OE-04 


OE-04! 


S 


OE-04f 


OE-04! 


OE-04! 


§ 


OE-04j 


s 


S 


s 


OE-O4I 


OE-04 


OE-04 


OE-04 


OE-04| 


























ri 


































Expression 




$ 


1 3.721 




I 1.«2| 








s 










I 3.37! 














1 
















ORFSEQ 
ID NO: 


| 23901 i 


I 24121, 


| 24661 


| 24770| 


| 27040| 


s 










] 19977| 


| 20639| 




1 


] 23581 j 






s 
i 






8 


! 26875| 


27780 | 


§ 








fe 
ro 


SEQ ID 

NO: 


14126 


1 4332 


H893 


! 


I 


I 


I7512| 




1 


,0124 


8 






8 


S 


1 3876) 




1 


I4924| 


1 


I 




1 






19576! 


1 


10141 J 


























































Probe 
SEQ ID 
NO: 


1 


| 4437| 


1 


I 5132| 


§ 


I 7588| 


1 


1 


1 


lost | 




8 


| 1810| 


1 


| 3882| 


1 


I 4711 | 


| 4784| 


| 5052| 


| 5735| 


1 


1 


| 7705| 


1 7977| 




1 


1 
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Top Hit Descriptor 


AU146707 HEMBB1 Homo sapiens cDNA clone HEMBB1001253 3'

Human dystrophin gene

Human dystrophin gene

qh98e11.x1 Soares_NFL_T_GBC_S1 Homo sapiens cDNA clone IMAGE:1855052 3' similar to contains MER3.b2 MER3 repetitive element ;

Homo sapiens chromosome 21 segment HS21C003

Mus musculus 5' flanking region of Pitx3 gene

zu39b05.s1 Soares ovary tumor NbHOT Homo sapiens cDNA clone IMAGE:740337 3' similar to contains Alu repetitive element;

Human germline T-cell receptor beta chain TCRBV17S1A1T, TCRBV2S1, TCRBV10S1P, TCRBV29S1P, TCRBV19S1P, TCRBV15S1, TCRBV11S1A1T, HVB relic, TCRBV28S1P, TCRBV34S1, TCRBV14S1, TCRBV3S1, TCRBV4S1A1T, TRY4, TRY5, TRY6, TRY7, TRY8, TCRBD1, TCRBJ1S1, TCRBJ1S2,>

am58c09.x1 Johnston frontal cortex Homo sapiens cDNA clone IMAGE:1539760 3'

QV2-BT0636-070500-194-b07 BT0636 Homo sapiens cDNA

Human tyrosine kinase TXK (txk) gene, exons 9 and 10

EST390550 MAGE resequences, MAGP Homo sapiens cDNA

Phaseolus vulgaris nitrate reductase (PVNR2) gene, complete cds


i 

a 
1 

1 

i 

1 
| 
1 
I 

1 
f 

i 

! 


1 

1 
< 

f 

(3 
O- 

1 

1 

i 
1 


Gallus gallus proteasome 28 kDa subunit homolog mRNA, complete cds

Danio rerio hagoromo gene, exons 1 to 6, partial cds

AV654352 GLC Homo sapiens cDNA clone GLCDUH10 3'

tq03b11.x1 NCI_CGAP_Ut3 Homo sapiens cDNA clone IMAGE:2207709 3'

AU121712 MAMMA1 Homo sapiens cDNA clone MAMMA1000798 5'

LINE-1 REVERSE TRANSCRIPTASE HOMOLOG

MYOMESIN 2 (M-PROTEIN) (165 KD TITIN-ASSOCIATED PROTEIN) (165 KD CONNECTIN-ASSOCIATED PROTEIN)

Solanum lycopersicum phytochrome F (PHYF) gene, partial cds

Solanum lycopersicum phytochrome F (PHYF) gene, partial cds

Homo sapiens DNA, DLEC1 to ORCTL4 gene region, section 1/2 (DLEC1, ORCTL3, ORCTL4 genes, complete cds)

Homo sapiens DNA, DLEC1 to ORCTL4 gene region, section 1/2 (DLEC1, ORCTL3, ORCTL4 genes, complete cds)


Top Hit 
Database 


z 

i 
=> 


z 


z 


EST HUMAN | 


z 




EST_HUMAN I 




|EST HUMAN | 


;est human S 




iEST HUMAN | 


z 


|EST HUMAN j 


iEST HUMAN j 






( EST HUMAN j 


IEST HUMAN | 


jEST_HUMAN | 


ISWISSPROT 


ISWISSPROT | 


z 






z 


Top Hit Accession
No.


:AU1 46707.1 


M86524.1 | 


1 


AI286021.1 j 


I 
S 


AF224268.1 | 


1 
I 


l 


AH 24529.1 J 


IBE082317.1 I 


U34374.1 j 


i 

< 


§ 
s 


H96265.1 | 


H96265.1 


1 
§ 


AB037997.1 | 


AV654352.1 j 


§ 
< 


< 




E 


U32444.2 


U32444.2 i 


i 
< 


AB026898.1 


Most Similar 
(Top) Hit 
BLAST E 
Value 


2.0E-04




2.0E-04




2.0E-04


2.0E-04


2.0E-04


2.0E-04


I 


3 


| 2.0E-04| 


1 


1 2.0E-04I 


| 2.0E-04| 


| 2.0E-04| 


i 2.0E-04I 




1 


j 2.0E-04| 


[ 2.0E-04| 


3 


l 








2.OE-O4I 


Expression 






I 


















s 






























ORF SEQ 
ID NO: 


j 20232| 














22293 


1 




8 


s 




24249 | 


24250] 


















g 




2 
8 


SEQ ID 
NO: 


j 104131 


I 


! 1081 5| 








12024| 


12402 


1 


I3307 


1 3334 


■3748 




1 


i 


14575 


1 


15322 




16155 




16356 


1 


16518! 


§ 


1 








































Probe 
SEQ ID 
NO: 








S 


1 




| • 2i3e| 




I 


1 


| 3417| 


| 3837| 


1 


1 


! 


! 


1 


I 




1 


1 


1 






1 


i 
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STATIN 8) | 
















1 








sntainsL1.t2L1 j 


antains element j 




36f, BAT5, G5b, 
complete cds 












Top Hit Descriptor 


Homo sapiens KIAA0237 gene product (KIAA0237), mRNA 


qx62h04.x1 NCI_CGAP_GC4 Homo sapiens cDNA clone IMAGE:2005975 3'


S.cerevisiae chromosome VII reading frame ORF YGL038c 


LINE-1 REVERSE TRANSCRIPTASE HOMOLOG 


1 

| 

1 
1 

o 

1 

o 
z: 

1 


1 
1 

o 
| 

1 

i 

o 

1 
o 

z 

1 


wf26e08.x1 Soares_NFL_T_GBC_S1 Homo sapiens cDNA clone IMAGE:2356742 3'

CYSTATIN-RELATED EPIDIDYMAL SPERMATOGENIC PROTEIN PRECURSOR (CY

Homo sapiens phospholipid scramblase 1 (PLSCR1), mRNA

LINE-1 REVERSE TRANSCRIPTASE HOMOLOG

Mouse alpha leukocyte interferon gene, complete cds

Homo sapiens mRNA for KIAA1142 protein, partial cds

CM0-CT0404-130700-475-h03 CT0404 Homo sapiens cDNA

CM2-NN0010-220300-124-d08 NN0010 Homo sapiens cDNA

ah45c11.s1 Soares_testis_NHT Homo sapiens cDNA clone 1292468 3'

wi54c11.x1 NCI_CGAP_Co16 Homo sapiens cDNA clone IMAGE:2394068 3' similar to ci
MER6 repetitive element ;

PROLYL 4-HYDROXYLASE ALPHA-2 SUBUNIT PRECURSOR

Homo sapiens gene for cholecystokinin type-A receptor, complete cds

Homo sapiens methyl-CpG binding protein 1 (MBD1) gene, exon 15b

xa34g05.x1 NCI_CGAP_Br18 Homo sapiens cDNA clone IMAGE:2568728 3' similar to c!
repetitive element ;

qv23f06.x1 NCI_CGAP_Lym6 Homo sapiens cDNA clone IMAGE:1982435 3' similar to cc
MIR repetitive element ;

PROLYL 4-HYDROXYLASE ALPHA-2 SUBUNIT PRECURSOR

CSK2B, BAT4, G4, Apo M, BAT3, BAT2, AIF-1, 1C7, LST-1, LTB, TNF, and LTA genes,

Pisum sativum mRNA for beta-1,3 glucanase (gns2 gene)


I 

1 
1 

I 

I 

1 
< 

E 
E 
I 
1 
E 
?o 
(I 


Human platelet-derived growth factor A chain (PDGFA) gene, exons only 


HSPD16734 HM3 Homo sapiens cDNA clone S3000003H04 


Human haptoglobin and haptoglobin-related protein (HP and HPR) genes, complete cds 


Top Hit




z 
S 




SPROT | 


z 

I 




z 
| 


iPROT | 




l 






IUMAN | 






IUMAN | 


SPROT | 






z 

1 


IUMAN | 


O 
OC 
















INT 


EST_h 


IN 


3SIMSI 


EST.H 


EST F 


lEST F 


5S1AAS 


NT 


3SIAASI 


NT 


NT 


lEST F 


X 


EST F 


ESTJ- 


SWISS 


NT 


NT 


X, 




;siMSi 


NT 


NT 


NT 






NT 


Top Hit Accession
No.


| 7662015| 


|AI357156.1 ; 


jZ72560.1 1 


!P08547 | 


1 
< 


|AI251 980.1 | 


1A1806220.1 i 


§ 

a 

§ 




jP08547 | 


M285S7. 1 I 


|AB032968.1 | 




|AW893325.1 | 




< 


§ 

a 


Q 


| AF1 20982.1 


AW073078.1 | 


i 

1 

< 


a 


AF1 29756.1 


|AJ251646.1 


1 
< 


JM83575.1 ; 


JF28172.1 | 


jM69197.1 | 




2 




1 .OE-04 






I .OE-04 




3 


! .OE-04 




1 .OE-04 


! .OE-04 




I .OE-04 


2 


J.OE-05 


3 


ec-30'e 


i.OE-05 


SO-30* 


i.OE-05 


J.OE-05 




5.0E-05 


3 


SC-30'S 


9C-30'S 


J.OE-05 
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Expression 
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14.47| 


s 




1.751 
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ORF SEQ 
ID NO: 


| 24672 1 
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| 276281 






§ 








23678| 










| 28705 1 
















SEQ ID
NO:


1 


1 


1501 5| 


15487i 


161291 


I6129| 






17413 


1 




18721 
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1 


I3902 




1 7264 


1 


1 8352 


18436 


15543 


1 
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I2847I 




I 




























































Probe 
SEQ ID 
NO: 


| 5028 


| 5033 


| 5148 




s 


1 


| 7380 


| 7384 


I 7562 I 


I 


1 


1 




1 


1 


s 






| 7457| 
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I 
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Top Hit Descriptor 


zs88h01.s1 NCI_CGAP_GCB1 Homo sapiens cDNA clone IMAGE:704593 3' similar to contains Alu repetitive element;contains element MSR1 repetitive element ;

RC3-CT0208-220999-011-E04 CT0208 Homo sapiens cDNA

RC3-CT0208-220999-011-E04 CT0208 Homo sapiens cDNA

HUM072014F Human fovea cDNA Homo sapiens cDNA clone EST HFD072014

HUM072014F Human fovea cDNA Homo sapiens cDNA clone EST HFD072014

PROBABLE GLYCEROL-3-PHOSPHATE ACYLTRANSFERASE, MITOCHONDRIAL PRECURSOR (GPAT)

Homo sapiens chromosome 21 segment HS21C078

Dictyostelium discoideum gene for TRFA, complete cds

Homo sapiens chromosome 21 segment HS21C001

Caenorhabditis elegans Skp1p homolog mRNA, complete cds

Rat cytomegalovirus Maastricht, complete genome

EST78713 Placenta I Homo sapiens cDNA

EST04984 Fetal brain, Stratagene (cat#936206) Homo sapiens cDNA clone HFBED60

Homo sapiens sarcoglycan, epsilon (SGCE), mRNA

Homo sapiens chromosome X open reading frame 6 (CXORF6) mRNA

wb54h06.x1 NCI_CGAP_GC6 Homo sapiens cDNA clone IMAGE:2309531 3' similar to gb:J03250 DNA TOPOISOMERASE I (HUMAN);

H.sapiens flow-sorted chromosome 6 HindIII fragment, SC6pA28B10

H.sapiens flow-sorted chromosome 6 HindIII fragment, SC6pA28B10

Homo sapiens monocyte/neutrophil elastase inhibitor gene, complete cds

CONTACTIN PRECURSOR (GLYCOPROTEIN GP135)

CONTACTIN PRECURSOR (GLYCOPROTEIN GP135)

yv50g11.r1 Soares fetal liver spleen 1NFLS Homo sapiens cDNA clone IMAGE:246212 5'

PM4-NN0050-310300-001-f10 NN0050 Homo sapiens cDNA

C4B-BINDING PROTEIN PRECURSOR (C4BP)

C4B-BINDING PROTEIN PRECURSOR (C4BP)

ye28c12.r1 Stratagene lung (#937210) Homo sapiens cDNA clone IMAGE:119062 5'

yi59d08.s1 Soares placenta Nb2HP Homo sapiens cDNA clone IMAGE:143535 3' similar to contains Alu repetitive element;contains LTR7 repetitive element ;

zk58f02.r1 Soares_pregnant_uterus_NbHPU Homo sapiens cDNA clone IMAGE:487035 5'

MR0-NT0038-250400-001-f09 NT0038 Homo sapiens cDNA


Top Hit
Database


EST_HUMAN


EST HUMAN 




EST HUMAN 


iEST HUMAN 


'SWISSPROT | 


iNT 


NT 


z 


NT 


NT | 


EST HUMAN j 


z 
1 


NT | 


NT 


EST HUMAN 


NT 


NT 


NT 


SWISSPROT I 


SWISSPROT f 


EST HUMAN ' 


X 


SWISSPROT | 


SWISSPROT I 


EST_HUMAN ( 


EST HUMAN | 


EST_HUMAN [ 


EST_HUMAN \ 


Top Hit Accession
No.


AA279333.1 


IAW847445.1 | 


AW847445.1 | 


L49075.1 | 


L49075.1 j 




AL163278.2 j 


1 


1 


l 


9845300 | 


1 
< 


T07095.1 | 




48851701 


< 


Z84506.1 | 


Z84506.1 j 


1 
< 


Q12860 | 


Q12860 j 


N72829.1 i 


1 






T94149.1 j 


R75639.1 i 


AA044015.1 j 


| I.0H-068MV 


Most Similar 
(Top) Hit 
BLAST E 
Value 


8.0E-05


7.0E-05


7.0E-05


7.0E-05


7.0E-05


7.0E-05


7.0E-05


7.0E-05


7.0E-05


7.0E-05


7.0E-05


7.0E-05


7.0E-05


7.0E-05


6.0E-05


6.0E-05


6.0E-05


6.0E-05


6.0E-05


6.0E-05


6.0E-05


6.0E-05


3 
§ 


6.0E-05


6.0E-05


6.0E-05


6.0E-05


6.0E-05


6.0E-05


11 

r 


$ 






I 1-05| 












1 °- 82 ! 








§ 






| 0.89| 




1 2.45| 










1 1.37| 






| 3.6s| 




I 8.44 1 


ORFSEQ 
ID NO: 




| 20118, 


| 20119 








| 22440, 


| 22843| 




| 240231 


J 24506j 




| 27505| 




il 




| 22409! 


I 


| 2041 5| 




| 25600 | 


1 




| 27437| 


| 27438j 


| 27558| 




1 


| 25073 | 


SEQ ID 
NO: 


1 




1 0303 


1 


I 






130461 


141 75 1 


14241 


1 4723 


I 


1 7296; 




11874 




I2519 1 




5 


15519| 


15519 




1 


I7234| 


1 


1 




1 


1 9587 1 




























































Probe 
SEQ ID 

NO: 


1 


344 




8 


8 


| 103q| 


I 


| 3121 | 


| 4276| 


I 4344J 


| 4842 | 


| 5202 1 


| 7508 1 


| 8501 1 


I 19811 


2544 


1 


1 


| 2783 | 


1 


1 


1 


i 


| 7330 | 


I 7330J 


| 7484| 


| 8131 | 


| 8807| 
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158 3' similar to | 




















158 3' similar to | 














Top Hit Descriptor 


7g28a08.x1 NCI_CGAP_Brn23 Homo sapiens cDNA clone IMAGE:3307766 3'

QV4-ST0234-241199-040-M1 ST0234 Homo sapiens cDNA

Homo sapiens 22kDa peroxisomal membrane protein-like (LOC55895), mRNA

Homo sapiens MEP1A gene, promoter region and exon 1

Homo sapiens partial SLC22A3 gene for extraneuronal monoamine transporter (EMT), exon '

Human MLC1 emb gene for embryonic myosin alkaline light chain, 3'UTR

AV653544 GLC Homo sapiens cDNA clone GLCDMA06 3'

RETINAL-BINDING PROTEIN (RALBP)

RETINAL-BINDING PROTEIN (RALBP)

Human renin (REN) gene, 5' flanking region

Homo sapiens PP1200 mRNA, complete cds

hi36c07.x1 Soares_NFL_T_GBC_S1 Homo sapiens cDNA clone IMAGE:2974380 3' similar
element MIR repetitive element ;

xd93e09.x1 Soares_NFL_T_GBC_S1 Homo sapiens cDNA clone IMAGE:2605192 3'

qh64c10.x1 Soares_fetal_liver_spleen_1NFLS_S1 Homo sapiens cDNA clone IMAGE:1849'
contains Alu repetitive element;contains element KER repetitive element ;


™. 
C5 
I 

g 
0 

z 

9 

.1 

X 

0 
8 

z 
SS 
8 

i 


601461463F1 NIH_MGC_66 Homo sapiens cDNA clone IMAGE:3865142 5'

601461463F1 NIH_MGC_66 Homo sapiens cDNA clone IMAGE:3865142 5'

SKELEMIN

PM1-HT0521-120200-001-e10 HT0521 Homo sapiens cDNA

PM1-HT0521-120200-001-e10 HT0521 Homo sapiens cDNA

EST79996 Placenta I Homo sapiens cDNA similar to similar to p53-associated protein

EST7999S Placenta I Homo sapiens cDNA similar to similar to p53-associated protein

Homo sapiens NOD1 protein (NOD1) gene, exons 1, 2, and 3

qh64c10.x1 Soares_fetal_liver_spleen_1NFLS_S1 Homo sapiens cDNA clone IMAGE:1849<
contains Alu repetitive element;contains element KER repetitive element ;

AV726630 HTC Homo sapiens cDNA clone HTCCEA01 5'

Mus musculus myosin light chain 2, precursor lymphocyte-specific (Mylc2pl), mRNA

601567451F1 NIH_MGC_21 Homo sapiens cDNA clone IMAGE:3842292 5'

zs60b05.s1 Stratagene schizo brain S11 Homo sapiens cDNA clone IMAGE:701841 3'

h!94e08.x1 NCI_CGAP_Lu24 Homo sapiens cDNA clone IMAGE:3009638 3'

Homo sapiens interleukin-1 receptor antagonist homolog 1 (IL1HY1), mRNA
EST84475 Colon adenocarcinoma IV Homo sapiens cDNA 5' end


Top Hit 
Database 


z 

1 
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ESTHUMAN 


NT | 


NT | 


NT I 


z 


EST HUMAN ! 


SWISSPROT ! 
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EST HUMAN j 


EST HUMAN I 
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NT I 




1 

£2 


NT i 


EST HUMAN | 
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z 

1 


NT I 
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Top Hit Acession 
No. 


BE858403.1 j 


AW392085.1 | 




AJ251058.1 


AJ251884.1 | 


X58855.1 j 


AV653544.1 I 


1 


s 
1 


U12821.1 j 


AF202635.1 | 


1 


I 

i 
< 




AW273851.1. j 


1 


BF037898.1 \ 


Q62234 | 


BE169211.1 


BE169211.1 ! 


1 


1 
1 


AF1 49773.1 j 


1 
< 


< 


110721 02| 


BE733157.1 ! 


AA284049.1 


AW770982.1 | 


6912431 1 
AA372562.1 


Most Similar 
(Top) Hit 
BLAST E 
Value 


6.0E-05| 


5.0E-05j 


5.0E-05! 


3 


5.0E-05j 


5.0E-05| 
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j? 
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4.0E-05! 


4.0E-0s| 
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20417| 






20871 | 


1 


I 
S3 


1 


I 


24046| 


24197| 




1 




26677| 
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MER3.b2 MER3 repetitive element ; 

Human adenosine deaminase (ADA) gene, complete cds

zq46a12.r1 Stratagene hNT neuron (#937233) Homo sapiens cDNA clone IMAGE:632734 5' similar to contains Alu repetitive element;contains element L1 repetitive element

RC3-BT0319-120200-014-h08 BT0319 Homo sapiens cDNA 

Homo sapiens p47-phox (NCF1) gene, complete cds 
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Top Hit Descriptor 


xa89a03.x1 NCI_CGAP_Co17 Homo sapiens cDNA clone IMAGE:2573932 3' similar to contains L1 repetitive element ;

Homo sapiens ABCA1 (ABCA1) gene, complete cds

AU131513 NT2RP3 Homo sapiens cDNA clone NT2RP3002707 5'

Homo sapiens chromosome 21 segment HS21C082

Drosophila melanogaster strain Lamto 120 Suppressor of Hairless (Su(H)) gene, partial cds

MOSAIC PROTEIN LGN

Homo sapiens chromosome 21 segment HS21C003


I 

| 
1 
1 

1 
I 

I 

z 
1 

1 


1 

2 

o 
D 

t 
| 

Jj 

i 
i 


Homo sapiens chromosome 21 segment HS21C046

Homo sapiens Spast gene for spastin protein

ns19g02.s1 NCI_CGAP_GCB1 Homo sapiens cDNA clone IMAGE:1184114 3' similar to contains L
L1 repetitive element ;

Homo sapiens phospholipase A2, group X (PLA2G10) mRNA, and translated products

52 KD RO PROTEIN (SJOGREN SYNDROME TYPE A ANTIGEN (SS-A)) (RO(SS-A))

Homo sapiens chromosome 21 segment HS21C027

zx36M2.s1 Soares_total_fetus_Nb2HF8_9w Homo sapiens cDNA clone IMAGE:788519 3' similar to gb:L02932 PEROXISOME PROLIFERATOR ACTIVATED RECEPTOR ALPHA (HUMAN);

zs05e11.r1 NCI_CGAP_GCB1 Homo sapiens cDNA clone IMAGE:684332 5' similar to contains Alu repetitive element;contains element TAR1 repetitive element ;

UI-H-BI2-agk-a-08-0-UI.s1 NCI_CGAP_Sub4 Homo sapiens cDNA clone IMAGE:2724398 3'

UI-H-BI2-agk-a-08-0-UI.s1 NCI_CGAP_Sub4 Homo sapiens cDNA clone IMAGE:2724398 3'

ha07c10.x1 NCI_CGAP_Kid12 Homo sapiens cDNA clone IMAGE:2873010 3' similar to contains L"
repetitive element ;

Human hereditary haemochromatosis region, histone 2A-like protein gene, hereditary haemochromat
(HLA-H) gene, RoRet gene, and sodium phosphate transporter (NPT3) gene, complete cds


. 

: 

. 
: 

J 

' 

j 

: 


(HLA-H) gene, RoRet gene, and sodium phosphate transporter (NPT3) gene, complete cds

tt73a06.x1 NCI_CGAP_HSC3 Homo sapiens cDNA clone IMAGE:2246386 3'
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ID NO: 
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SEQ ID 
NO: 


19677 


1 




1 
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14627 1 


14763 1 
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16079 1 


16430 1 


17045] 


17117| 


17220 | 


17554| 




17701 1 
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18170 


12504| 
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SEQ ID 
NO: 
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I 
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Top Hit Descriptor 


Human alanine:glyoxylate aminotransferase (AGXT) gene, exons 1 and 2


Homo sapiens differentiation antigen CD20 gene, exons 5, 6


ox20g01.x1 Soares_fetal_liver_spleen_1NFLS_S1 Homo sapiens cDNA clone IMAGE:1656912 3' similar to contains Alu repetitive element;


Homo sapiens chromosome 21 segment HS21C009


SUSHI REPEAT-CONTAINING PROTEIN SRPX PRECURSOR (DRS PROTEIN) (DOWN-REGULATED BY V-SRC)


SUSHI REPEAT-CONTAINING PROTEIN SRPX PRECURSOR (DRS PROTEIN) (DOWN-REGULATED BY V-SRC)


Human apolipoprotein E (APOE) gene, hepatic control region HCR-2


PUTATIVE SERINE/THREONINE-PROTEIN KINASE C22E12.14C


RC3-CT0283-201199-011-h11 CT0283 Homo sapiens cDNA


ANKYRIN 1 (ERYTHROCYTE ANKYRIN)


ANKYRIN 1 (ERYTHROCYTE ANKYRIN)


ab90f10.s1 Stratagene lung (#937210) Homo sapiens cDNA clone IMAGE:854251 3' similar to contains MER20.M MER20 repetitive element ;


Homo sapiens KIAA0555 gene product (KIAA0555), mRNA


qw16g09.x1 NCI_CGAP_Ut3 Homo sapiens cDNA clone IMAGE:1991296 3' similar to contains Alu repetitive element;


EST99205 Thyroid Homo sapiens cDNA 5' end similar to EST containing L1 repeat


QV2-OT0062-250400-173-h01 OT0062 Homo sapiens cDNA


601881522F1 NIH_MGC_57 Homo sapiens cDNA clone IMAGE:4093972 5'


QV3-BT0379-010300-105-d11 BT0379 Homo sapiens cDNA


QV3-BT0379-010300-105-d11 BT0379 Homo sapiens cDNA


OVARIAN ABUNDANT MESSAGE PROTEIN (OAM PROTEIN)


ox08e02.x1 Soares_fetal_liver_spleen_1NFLS_S1 Homo sapiens cDNA clone IMAGE:1655738 3' similar to contains MER8.t2 MER8 repetitive element ;


Mus musculus E-cadherin binding protein E7 mRNA, complete cds


IL5-UM0070-110400-063-g02 UM0070 Homo sapiens cDNA


Homo sapiens calcium channel, voltage-dependent, alpha 1I subunit (CACNA1I), mRNA


Homo sapiens chromosome 21 segment HS21C046


Human ABL gene, exon 1b and intron 1b, and putative M8604 Met protein (M8604 Met) gene, complete cds


EST185496 Colon carcinoma (HCC) cell line Homo sapiens cDNA 5' end


Top Hit 
Database 


z 


z 


;EST HUMAN | 


z 


SWISSPROT | 


SWISSPROT 1 


Z 


SWISSPROT | 


EST HUMAN [ 


SWISSPROT | 


SWISSPROT | 


EST HUMAN | 


z 


EST HUMAN | 


EST HUMAN | 


EST HUMAN | 


| 


EST HUMAN | 


EST HUMAN | 


SWISSPROT | 


EST HUMAN | 




EST HUMAN [ 








EST_HUMAN | 


Top Hit Accession
No. 


1 
1 


I 




1 
§ 


§ 
1 


§ 
I 


U35114.1 I 


Q10364 ] 


AW362539.1 | 


Q02357 || 


Q02357 | 


i 

I 




I 

< 


AA385542.1 \ 


AW883141.1




BE069189.1 | 


i 
1 


Q01456 j 


| 
< 


AF167441.1


AW801912.1


11418157


AL163246.2


U07561.1


AA313620.1


Most Similar 
(Top) Hit 
BLAST E 
Value 


9.0E-06


9.0E-06


9.0E-06




9.0E-06


9.0E-06


9.0E-06


9.0E-06


8.0E-06| 


8.0E-06| 


8.0E-06| 


7.0E-06


7.0E-06


7.0E-06


7.0E-06| 


7.0E-06| 




S 


S 


6.0E-06| 


i 


6.0E-06| 


6.0E-06| 


8 


5.0E-06| 


90-309 


5.0E-06| 


Expression 
Signal 


1 


SB 
w 










s 














1 




























ORFSEQ 
ID NO: 






| 26611. 


s 


| 272S7 


| 27258 


27394! 


28434| 


22251 1 
















§ 


22607[ 




22637| 


1 


S 
I 




8 


25706| 


§ 


27907| 


SEQ ID 
NO: 


13473 






I 


I 


| 17069 




1 


12722| 


1 


1 


1 


11326, 


a 


13432 


15412| 


1 


12814| 


1 


12838| 




15203' 






15604| 




1 


Probe 
SEQ ID 

NO: 


8 


1 




i 


8 


| 71S2| 


7317 


I 


2483| 


I 


1 


1 


1420| 


1 


3516| 


1 


9072| 


2887| 


1 


4647| 


§ 


5281 1 


7670 1 




1 


1 
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Top Hit Descriptor 


PROTEIN-ARGININE DEIMINASE TYPE IV (PEPTIDYLARGININE DEIMINASE IV) (PAD-R4) (PEPTIDYLARGININE DEIMINASE TYPE ALPHA)


oc04c10.s1 NCI_CGAP_GCB1 Homo sapiens cDNA clone IMAGE:1339890 3'


QV1-UM0036-200300-115-g02 UM0036 Homo sapiens cDNA


Homo sapiens chromosome 9 duplication of the T cell receptor beta locus and trypsinogen gene families


Rattus norvegicus mRNA for 45 kDa secretory protein, partial


Homo sapiens TRF2-interacting telomeric RAP1 protein (RAP1) mRNA, complete cds


Homo sapiens DiGeorge syndrome critical region, telomeric end


Homo sapiens DiGeorge syndrome critical region, telomeric end


Fugu rubripes beta-cytoplasmic (vascular) actin gene, complete cds


Homo sapiens homeobox protein CDX4 (CDX4) gene, complete cds and flanking repeat regions


Homo sapiens homeobox protein CDX4 (CDX4) gene, complete cds and flanking repeat regions


zr08b07.s1 Stratagene NT2 neuronal precursor 937230 Homo sapiens cDNA clone IMAGE:650869 3' similar to gb:L31860 GLYCOPHORIN A PRECURSOR (HUMAN);contains Alu repetitive element;


yc15g04.s1 Stratagene lung (#937210) Homo sapiens cDNA clone IMAGE:80790 3' similar to contains L1 repetitive element ;


1/6 AUTOANTIGEN


HYPOTHETICAL 72.5 KD PROTEIN C2F7.10 IN CHROMOSOME I


Homo sapiens caveolin 1 (CAV1) gene, exon 3 and partial cds


xa05h07.x1 Soares_NFL_T_GBC_S1 Homo sapiens cDNA clone IMAGE:2567485 3' similar to WP:C38H2.1 CE00923 PROBABLE RABGAP DOMAINS ;


xa05h07.x1 Soares_NFL_T_GBC_S1 Homo sapiens cDNA clone IMAGE:2567485 3' similar to WP:C38H2.1 CE00923 PROBABLE RABGAP DOMAINS ;


Homo sapiens chromosome 21 segment HS21C101


RC3-NN0066-260400-021-g11 NN0066 Homo sapiens cDNA




AV729390 HTC Homo sapiens cDNA clone HTCAEG02 5'


Homo sapiens chromosome 21 segment HS21C103


CM4-NN0003-280300-124-e06 NN0003 Homo sapiens cDNA


zn85h11.x5 Stratagene lung carcinoma 937218 Homo sapiens cDNA clone IMAGE:565029 3' similar to contains THR.b2 THR repetitive element ;


Homo sapiens chromosome 21 segment HS21C082




o 


IUMAN j 


2 
| 


















z 

3 


IUMAN | 


3PROT | 


5PROT | 




2 

s 


IUMAN | 




2 
5 


z 

5 
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z 
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iSIAAS 


| EST b 


EST b 


z 


NT 


z 
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NT 


| NT 


NT 


EST b 
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1 


5SIMS 


NT 




EST b 


NT 


EST I- 


EST 1- 


EST h 


NT 


EST b 


EST b 


NT 


Top Hit Accession
No. 


1 


AA815175.1 ; 


AW797168.1 | 


AF029308.1 | 


AJ132352.1 | 


< 


| 


l 


j 


AF003530.1 j 


AF003530.1 | 


AA223260.1 


T63042.1 | 


Q26768 ! 


Q09701 I 


AF125348.1


AW070995.1


AW070995.1


AL163301.2


I 

1 
i 


AI208715.1


AV729390.1


AL163303.2


AW892507.1


AI732462.1


AL163282.2


imilar 
) Hit 
;te 


3 


DE-07 1 


3 


DE-07 


DE-07 


3 


DE-37 


3 


3 


DE-07 


DE-07 


3 


DE-07 


DE-07 


DE-37! 


3 


3 


DE-07 


3 


DE-07 


DE-07 


3 


DE-07 


3 


3E-07I 


DE-07 
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! 3.91[ 


| 1.61 
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| 10122j 


j 10122! 
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1 


10853 




1 


| 11486| 


| 13547| 


I 
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i 15199| 
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| 17478j 




| 195571 


j 11002S 
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Top Hit Descriptor 


ANKYRIN 1 (ERYTHROCYTE ANKYRIN)


Rat mRNA for ribosomal protein L31


DYNEIN HEAVY CHAIN (DYHC)


DYNEIN HEAVY CHAIN (DYHC)


cong3.P11.A5 conorm Homo sapiens cDNA 3'


Rattus norvegicus Munc13-1 mRNA, complete cds


DYNEIN HEAVY CHAIN (DYHC)


DYNEIN HEAVY CHAIN (DYHC)


Homo sapiens SCL gene locus


Homo sapiens chromosome 21 segment HS21C048


Homo sapiens chromosome 21 segment HS21C048


MR0-HT0166-191199-004-g09 HT0166 Homo sapiens cDNA


Homo sapiens chromosome 21 segment HS21C048


RETROVIRUS-RELATED POL POLYPROTEIN [CONTAINS: REVERSE TRANSCRIPTASE ; ENDONUCLEASE]


Homo sapiens chromosome 21 segment HS21C009


Homo sapiens chromosome 21 segment HS21C103


nh03b09.s1 NCI_CGAP_Thy1 Homo sapiens cDNA clone IMAGE:943193 similar to contains Alu repetitive


ALDEHYDE OXIDASE


COMPLEMENT C2 PRECURSOR (C3/C5 CONVERTASE)


QV0-CT0225-131099-034-a12 CT0225 Homo sapiens cDNA


DORSAL-VENTRAL PATTERNING TOLLOID PROTEIN PRECURSOR


DORSAL-VENTRAL PATTERNING TOLLOID PROTEIN PRECURSOR


Cricetulus griseus ribosomal transcription factor (UBF2) mRNA, complete cds


an22d10.x1 Gessler Wilms tumor Homo sapiens cDNA clone IMAGE:1699411 3' similar to contains Alu repetitive element;contains element MER22 repetitive element ;


Homo sapiens mRNA for UGA suppressor tRNA-associated antigenic protein (tRNA48 gene)




602248024F1 NIH_MGC_S2 Homo sapiens cDNA clone IMAGE:4333300 5'


zd65g03.r1 Soares_fetal_heart_NbHH19W Homo sapiens cDNA clone IMAGE:345556 5' similar to contains L1.t1 L1 repetitive element;


tb95a11.x1 NCI_CGAP_Co16 Homo sapiens cDNA clone IMAGE:2062076 3' similar to contains MER18.b3 MER18 MER18 repetitive element ;
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O 


3PROT 


lUMAN 
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1 


O 








lUMAN 
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3PROT 


O 


lUMAN | 
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O 
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z 
1 


*|* 
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SSI MS 
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X 


z 


X 


I 
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p Hit Acession 
No. 


1 


1 




1 


35743.1 | 


1 




1 


31016.1 ; 


63248.2 I 


63248.2 [ 


! 


63248.2 


1 


63209.2 | 


1 


! 




1 


i 




IP25723 | 


:571.1 


50027.1 j 


1 
< 


592493.1 | 


1 

j8 


6159.1 j 








I 














< 








< 






< 








< 
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Most Similar 
(Top) Hit 
BLAST E 
Value 


7.0E-08j 


7.0E-08| 


7.0E-08| 


s 


7.0E-08| 


s 


7.0E-08| 




7.0E-08| 


6.0E-08i 


6.0E-08| 


6.0E-08 




6.0E-08 


6.0E-08| 




5.0E-08




5.0E-08| 




4.0E-08| 


4.0E-08| 


4.0E-08| 


4.0E-08


s 


4.0E-08| 


4.0E-08| 


I 


4.0E-08


Expression 
Signal 




| 12.71| 
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ri 
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ORFSEQ 
ID NO: 
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| 23240 | 






j 29027| 


i 23240 i 






1 

R 


R 
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21981 | 








1 
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3 
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| 285711 








10057 
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13444' 


1 


1 8075! 


I8734| 
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1 8632: 
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1 
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18313 
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| 3528| 
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a 
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| 9770 | 


s 
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| 4155| 






B 
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P 


| 9057| 
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| 1724| 
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Top Hit 
Database 

EST HUMAN 
EST HUMAN 




Top Hit Accession
No.

AA191195.1
AA191195.1






Most Similar
(Top) Hit 
BLAST E 
Value 

3.0E-08
3.0E-08


3.0E-08
3.0E-08

3.0E-08 

3.0E-08 
2.0E-08 

2.0E-08 
2.0E-08 
2.0E-08 
2.0E-08
2.0E-08
2.0E-08
2.0E-08
2.0E-08

2.0E-08
2.0E-08
2.0E-08
2.0E-08
2.0E-08
2.0E-08

2.0E-08

2.0E-08
2.0E-08
2.0E-08




Expression 
Signal 

2 7.2' 

3 7.2* 


3.58 

1.41

2.97

11.6
15.76 

7.97 
1.48 
10.93 
10.93 
37.42 

3.99 

1.42 
2.16 
6.02 
8.02 
1.66 
0.78 

3.35 

2.89 
1.31 
11.62 




ORFSEQ
ID NO: 

2481: 
2481J 


2486' 
2647S 

20245 
20396 
20397 

21082 

22140 

22901 
22902 

23669 

26791 
29092 






15120 
16312 

16416 

18819 
101711 

10194 
10431 
10581 
10581 
10898 
11226 
11608 
11708 

12248 
12371 
13096 
13096 
13695 
13893 

14203 

14756 
16602 
8799 




Probe 
SEQ ID 
NO: 

5184 
5184 

5446 


6163 
6451 

6558 

9027 
199 

223 
488 
644 
644 
974 
1319 
1707 
1811 

2368 
2496 
3171 
3171 
3783 
3986 

4305 

4876 
6722 
8995 
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Top Hit Descriptor 


POL POLYPROTEIN [CONTAINS: REVERSE TRANSCRIPTASE ; ENDONUCLEASE]


PM2-HT0130-150999-001-f12 HT0130 Homo sapiens cDNA


TCBAP1D5232 Pediatric pre-B cell acute lymphoblastic leukemia Baylor-HGSC project=TCBA Homo sapiens cDNA clone TCBAP5232


TCBAP1D5232 Pediatric pre-B cell acute lymphoblastic leukemia Baylor-HGSC project=TCBA Homo sapiens cDNA clone TCBAP5232


Homo sapiens chromosome 21 segment HS21C080


Homo sapiens hyperion gene, exons 1-50


ot35a05.s1 Soares_testis_NHT Homo sapiens cDNA clone IMAGE:1618736 3'


Homo sapiens major histocompatibility locus class III region


Human lambda-immunoglobulin constant region complex (germline)


Human lambda-immunoglobulin constant region complex (germline)


MR4-ST0240-240700-013-g04 ST0240 Homo sapiens cDNA


Homo sapiens chromosome 21 segment HS21C079


Homo sapiens chromosome 21 segment HS21C079


RC5-BN1058-270400-031-C06 BN1058 Homo sapiens cDNA


qd42e07.x1 Soares_fetal_heart_NbHH19W Homo sapiens cDNA clone IMAGE:1732164 3' similar to contains MSR1.t1 MSR1 repetitive element ;


CM0-NN1004-100300-273-e06 NN1004 Homo sapiens cDNA


op74d08.s1 Soares_NFL_T_GBC_S1 Homo sapiens cDNA clone IMAGE:1582575 3'


Homo sapiens DNA for 3-ketoacyl-CoA thiolase beta-subunit of mitochondrial trifunctional protein, exon 2, 3


Homo sapiens gene for enteric smooth muscle gamma-actin, exon 2, 3


Human lysosomal membrane glycoprotein-2 (LAMP2) gene, 5' end and flanking region


601111173F1 NIH_MGC_16 Homo sapiens cDNA clone IMAGE:3351834 5'


DKFZp434C0514_r1 434 (synonym: htes3) Homo sapiens cDNA clone DKFZp434C0514 5'


nh 7a11.s1 NCI_CGAP_HSC1 Homo sapiens cDNA clone IMAGE:1040924 similar to contains L1.t2 L1 repetitive element ;


PM1-HT0527-160200-001-h05 HT0527 Homo sapiens cDNA


xn85h08.x1 Soares_NFL_T_GBC_S1 Homo sapiens cDNA clone IMAGE:2701311 3'


Homo sapiens fibroblast growth factor receptor 3 (achondroplasia, thanatophoric dwarfism) (FGFR3) mRNA


Homo sapiens testis-specific kinase substrate (TSKS) gene, complete cds


RC2-HT0252-120200-014-h10 HT0252 Homo sapiens cDNA


Top Hit 


ource 


SWISSPROT [ 


EST HUMAN | 


I EST HUMAN | 


lEST HUMAN | 


NT j 


NT | 


| 


NT ! 


NT | 


NT I 


EST HUMAN J 
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NT | 


EST HUMAN | 
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EST HUMAN | 


EST HUMAN | 


NT | 


NT J 


NT j 


EST HUMAN | 


EST HUMAN | 


EST HUMAN I 


EST HUMAN | 


EST HUMAN | 


NT | 


NT [ 


ESTHUMAN | 


J 
1 

e 


No. 
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BE1 41 959.1 | 


1 
£j 


1 


| 
< 


| 


AI015304.1


AF044083.1


X51755.1


X51755.1


BF375398.1


AL163279.2


AL163279.2


BE012076.1


AI183500.1


AW900159.1


1 
1 




s 


L09709.1 | 


BE254850.1 j 


AL040439.1 | 


i 
I 




AW195784.1 | 


i 4503710 1 


AF200923.2 ( 
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Most Similar 
(Top) Hit


BLAST E 
Value 


1.0E-08 


1.0E-08 
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1.0E-08 
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1.0E-08 
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8.0E-09' 
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7.0E-09' 
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6.0E-09 1 
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I 
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7850 ( 
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| 7294! 
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1 



EST68746 Fetal lung II Homo sapiens cDNA 5' end


PM2-UM0053-240300-005-C09 UM0053 Homo sapiens cDNA


Homo sapiens chromosome 21 segment HS21C082


Homo sapiens chromosome 21 segment HS21C085


Homo sapiens hypothetical protein (AF038169), mRNA


EST58385 Infant brain Homo sapiens cDNA 5' end similar to similar to heat shock protein, 90 kDa


hu09e09.x1 NCI_CGAP_Lu24 Homo sapiens cDNA clone IMAGE:3166120 3' similar to contains MER18.t3 MER18 repetitive element ;


hu09e09.x1 NCI_CGAP_Lu24 Homo sapiens cDNA clone IMAGE:3166120 3' similar to contains MER18.t3 MER18 repetitive element ;


PROTEIN MOV-10


hu09e09.x1 NCI_CGAP_Lu24 Homo sapiens cDNA clone IMAGE:3166120 3' similar to contains MER18.t3 MER18 repetitive element ;


Homo sapiens eukaryotic initiation factor 4AI (EIF4A1) gene, partial cds


258.1 KDA PROTEIN C21ORF5 (KIAA0933)


Homo sapiens chromosome 21 segment HS21C047


7l72c08.x1 Soares_NSF_F8_9W_OT_PA_P_S1 Homo sapiens cDNA clone IMAGE:3527030 3'




H.sapiens PADPRP-I gene for NAD(+) ADP-ribosyltransferase


Homo sapiens chromosome 21 segment HS21C084


DKFZp761B1710_r1 761 (synonym: hamy2) Homo sapiens cDNA clone DKFZp761B1710 5'


258.1 KDA PROTEIN C21ORF5 (KIAA0933)


BRAIN-SPECIFIC ANGIOGENESIS INHIBITOR 2 PRECURSOR


zx63h06.r1 Soares_total_fetus_Nb2HF8_9w Homo sapiens cDNA clone IMAGE:796187 5' similar to contains Alu repetitive element;


Homo sapiens Xq pseudoautosomal region; segment 1/2


Homo sapiens chromosome 21 segment HS21C048


H.sapiens PADPRP-I gene for NAD(+) ADP-ribosyltransferase


nc11c02.r1 NCI_CGAP_Pr1 Homo sapiens cDNA clone IMAGE:1007810 similar to contains Alu repetitive element;


Homo sapiens CCAAT-box-binding transcription factor (CBF2) mRNA


Homo sapiens CCAAT-box-binding transcription factor (CBF2) mRNA


Homo sapiens 959 kb contig between AML1 and CBR1 on chromosome 21q22; segment 1/3


Top Hit 
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iEST_HUMAN 


iNT | 


INT ! 


.NT 


|EST HUMAN 


;EST_HUMAN 


i 
=> 


ISWISSPROT j 


z 
1 

X 




SWISSPROT 


NT i 


EST HUMAN i 


EST_HUMAN | 


INT I 


;nt ! 


jEST HUMAN | 


SWISSPROT j 


SWISSPROT | 


;EST HUMAN [ 


INT I 


NT 


!NT [ 


EST_HUMAN | 


z 


NT ! 


z 


Top Hit Accession


No. 


AA359454.1 


I 


IAL163282.2 


| 
< 




i 
1 

i 




BE222239.1 ! 


1 


BE222239.1 


AF1 75325.1 I 


Q9Y3R5 


.163247.2 i 


I 


i 
§ 


6674.1 | 


i 

s 


.118573.1 ! 




O 


AA461 430.1 | 


AJ271735.1 I 


AL1 63248.2 | 


1 

X 


AA226070.1 ! 


i 


5031624| 


AJ229041.1 | 


< 






X 


< 




Most Similar 
(Top) Hit 


BLAST E 
Value 


| 5.0E-09! 


5.0E-09


| 4.0E-09| 


4.0E-09


| 4.0E-09j 


1 4.0E-09! 


| 3.0E-09 


3.0E-09


3.0E-09


3.0E-09


j 3.0E-09| 


2 


! 3.0E-09| 


s 


! 3.0E-09| 


| 2.0E-09j 


I 2.0E-09| 


| 2.0E-09| 


9 

B 


j 2.0E-09| 


2.0E-09


j 2.0E-09| 


! 2.0E-09| 


i 2.0E-09] 


2.0E-09


i 1.0E-09| 


| 1.0E-09| 


1.0E-09| 


Expression 






! 2.69, 


8 




0.95| 




3.82 


1.25 










1.73| 










10.52| 


2.25\ 














2.48 1 


2.48| 


0.95 1 


ORFSEQ 






I 27901 ; 












i 




§ 








s 


2851 1 | 
















8 






S 
§ 


20851 [ 




SEQ ID 
NO: 


| 15764| 


1 


| 10452 | 




| 11358| 








1 


1320s| 


1 


14305| 


177341 


18261 1 


18261 | 




3 


11541 | 








1 


5 






11009| 




1 


Probe 
SEQ ID 
NO: 


s 

8 


| 7811, 




1 


14531 


| 2379 


| 2303 


| 250e| 


| 2614| 


| 328/| 


R 
9 


I 


| 7884| 


s 


3 
£ 


I 795| 


| 1237| 


S 


| 2278| 


1 


1 


| 7055| 


1 


| 9576 | 






| 1083 | 
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c 




qy64e11.x1 NCI_CGAP_Brn25 Homo sapiens cDNA clone IMAGE:2016812 3' similar to contains MER12.t2 MER12 repetitive element ;


Homo sapiens basic transcription factor 2 p44 (btf2p44) gene, partial cds, neuronal apoptosis inhibitory protein (naip) and survival motor neuron protein (smn) genes, complete cds


Homo sapiens nucleolar phosphoprotein B23 (NPM1) mRNA, complete cds


Homo sapiens nucleolar phosphoprotein B23 (NPM1) mRNA, complete cds


MYOSIN LIGHT CHAIN KINASE, SMOOTH MUSCLE AND NON-MUSCLE ISOZYMES (MLCK) [CONTAINS: TELOKIN]


601058602F1 NIH_MGC_10 Homo sapiens cDNA clone IMAGE:3445177 5'


zh35b03.s1 Soares_pineal_gland_N3HPG Homo sapiens cDNA clone IMAGE:414029 3' similar to contains Alu repetitive element;contains element MER22 repetitive element ;


Human breakpoint cluster region (BCR) gene, complete cds


CIRCUMSPOROZOITE PROTEIN PRECURSOR (CS)


Homo sapiens chromosome 21 segment HS21C083


Homo sapiens GTP binding protein 1 (GTPBP1), mRNA


MR0-SN0040-050500-002-C07 SN0040 Homo sapiens cDNA


|W©78h03.x1 Soares_Dieckgraefe_colon_NHCD Homo sapiens cDNA clone IMAGE:2347253 3' similar to 
|SW:RL29_HUMAN P47914 60S RIBOSOMAL PROTEIN L29 ;contains element PTR5 repetitive element ; 


tj46b09.x1 Soares_NSF_F8_9W_OT_PA_P_S1 Homo sapiens cDNA clone IMAGE:2144537 3' similar to TR:O00372 O00372 PUTATIVE P150. ;


Homo sapiens MCM4 (MCM4) and DNA-PKcs (PRKDC) genes, partial cds


QV1-BT0631-150200-071-f01 BT0631 Homo sapiens cDNA


EST89564 Small intestine I Homo sapiens cDNA 5' end


Homo sapiens lens major intrinsic protein (MIP) gene, complete cds


Homo sapiens chromosome 21 segment HS21C080


Homo sapiens TPA inducible protein (LOC51586), mRNA


Homo sapiens TPA inducible protein (LOC51586), mRNA


LYSP100 PROTEIN (LYMPHOID-RESTRICTED HOMOLOG OF SP100)


LINE-1 REVERSE TRANSCRIPTASE HOMOLOG


LINE-1 REVERSE TRANSCRIPTASE HOMOLOG


H.sapiens DHFR gene, exon 3


EST51247 Gall bladder II Homo sapiens cDNA 5' end


DNA-DIRECTED RNA POLYMERASE II LARGEST SUBUNIT


Top Hit 


I 8 
11 


EST HUMAN | 


NT 




NT | 


iSWISSPROT | 


| 


EST HUMAN 




SWISSPROT ! 


NT | 


NT | 


EST HUMAN j 


I 


| EST HUMAN | 


z 


EST HUMAN f 


EST_HUMAN | 


NT I 


NT j 


NT | 


NT I 


SWISSPROT | 


SWISSPROT | 


SWISSPROT j 


NT j 


EST HUMAN [ 


SWISSPROT | 


Top Hit Accession


1 


< 


U80017.1 


1 


:. 


P11799 | 


1 


AA71 9297.1 ; 


U07000.1 


1 


< 


11418127| 


AW867740.1 | 


< 


AI452982.1 


U63630.2 I 


BE080748.1 | 


! 


U36308.2 j 


| 
< 


7706225 1 


7706225 1 


Q13342 


1 


P08547 


X00856.1 | 


AA345220.1 \ 


P35084 | 


Most Similar 
(Top) Hit 


BLAST E 
Value 


1.0E-09


1.0E-09


9 


1.0E-09


6C-30'L 




3 




3 


1.0E-09




9.0E-10




9.0E-10 


8.0E-10| 


8.0E-10| 


8.0E-10| 


8.0E-10| 




7.0E-10| 


7.0E-10


7.0E-10| 


7.0E-10| 


7.0E-10| 


7.0E-10| 


7.0E-10| 


7.0E-10| 


Expression 




0.94
















3.21 1 
















s 


2.32| 


2.31 1 




12.451 








2.65 1 






ORF SEQ 


ID NO: 




22580 1 


1 


1 














24996 | 






i 












1 

R 










f? 


8 
s 




SEQ ID 

NO: 




| 12788| 


1 


12822[ 




1 






I 


17761 | 


§ 






I 16032I 




13222| 


1 


17575| 




1 




11510| 


11867| 


1 




15662| 


16373 1 


Probe 
SEQ ID 
NO: 


| 2453 




1 


1 


1 


.3002, 


1 




1 


7911 1 


s 


12871 


2803 


§ 




3300 


4106| 


7725. 




1 


1 


1 


1974 


2512 




5754 


6514[ 
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Top Hit Descriptor 


Homo sapiens ASCL3 gene, CEGP1 gene, C11orf14 gene, C11orf15 gene, C11orf16 gene and C11orf17 gene




RC3-CT0254-031099-012-g12 CT0254 Homo sapiens cDNA


EST384012 MAGE resequences, MAGL Homo sapiens cDNA


RC3-NN0070-110800-014-h07 NN0070 Homo sapiens cDNA


DKFZp434N219_r1 434 (synonym: htes3) Homo sapiens cDNA clone DKFZp434N219 5'


HYPOTHETICAL GENE 48 PROTEIN


601822184F1 NIH_MGC_75 Homo sapiens cDNA clone IMAGE:4042413 5'


HYPOTHETICAL 67.9 KD PROTEIN ZK688.8 IN CHROMOSOME III


HYPOTHETICAL 67.9 KD PROTEIN ZK688.8 IN CHROMOSOME III


qg09f09.x1 Soares_placenta_8to9weeks_2NbHP8to9W Homo sapiens cDNA clone IMAGE:1759049 3' similar to contains LTR8.D2 LTR8 repetitive element ;


hg58g03.x1 NCI_CGAP_GC6 Homo sapiens cDNA clone IMAGE:2949844 3' similar to contains Alu repetitive element;


Homo sapiens chromosome 21 segment HS21C103


Homo sapiens mannosidase, beta A, lysosomal (MANBA) gene, and ubiquitin-conjugating enzyme E2D 3 (UBE2D3) genes, complete cds


Homo sapiens X-linked anhidroitic ectodermal dysplasia protein gene (EDA), exon 2 and flanking repeat regions


yy32f06.s1 Soares melanocyte 2NbHM Homo sapiens cDNA clone IMAGE:272963 3' similar to contains L1.t1 L1 repetitive element;


Homo sapiens extracellular glycoprotein lacritin precursor, gene, complete cds


Homo sapiens chromosome 21 segment HS21C003


Homo sapiens chromosome 21 segment HS21C003


RHOMBOID PROTEIN (VEINLET PROTEIN)


ba76d08.y1 NIH_MGC_20 Homo sapiens cDNA clone IMAGE:2906319 5'


AV743302 CB Homo sapiens cDNA clone CBFBGD08 5'


AV743302 CB Homo sapiens cDNA clone CBFBGD08 5'


ys74b12.s1 Soares retina N2b4HR Homo sapiens cDNA clone IMAGE:220511 3' similar to contains MER29 repetitive element ;


IL3-CT0219-160200-064-B06 CT0219 Homo sapiens cDNA


IL3-CT0219-160200-054-B06 CT0219 Homo sapiens cDNA




Top Hit 
Database 


NT 


ESTJHUMAN ! 


2 

=> 
X 


iEST HUMAN | 


!EST HUMAN 


EST HUMAN | 


I SWISSPROT 


EST HUMAN | 


[SWISSPROT [ 


SWISSPROT [ 


EST HUMAN | 


| 

X 


NT I 


NT I 


NT | 


ESTJHUMAN | 


H 


INT I 


NT | 


SWISSPROT | 


EST HUMAN | 


! EST HUMAN [ 


EST HUMAN [ 


EST HUMAN | 


EST HUMAN | 


EST HUMAN | 


EST_HUMAN | 


Top Hit Accession
No. 


AJ400877.1


|A(424405.1 ! 


|AW853719.1 | 


|AW971923.1 [ 


I 


|AL046804.1 ] 


Q01033


|BF105159.1 j 


)P34678 I 


P34678 | 


! AI221083.1 | 


AW 594709.1 | 


AL163303.2


< 


AF003528.1 | 


N36113.1 | 


AY005150.1 I 


AL163203.2


AL163203.2


P20350 j 


BE302970.1 j 


I 


AV743302.1 j 


HB7208.1 | 


AW850731.1 | 


AW850731.1 | 


1 


Most Similar 
(Top) Hit 
BLAST E 
Value 
























































6.0E-- 




6.0E-- 




6.0E-" 


5.0E-- 






5.0E-- 




4.0E-' 


4.0E-- 


4.0E-- 


4.0E-" 


4.0E-" 


3.0E-" 


3.0E-' 


3.0E-' 


3.0E-- 


3.0E-" 


3.0E-' 


3.0E-- 




3.0E-- 


3.0E-' 


3.0E-" 


3.0E-" 


Expression 
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§ 






e 








19.23| 




1.72 
















J 




•L47I 


2.56 1 






















ORFSEQ 
ID NO: 


5 


1 










I 231 52| 




27570| 


27571 1 




21732| 


22292| 


2627s| 








241171 


24118| 




25851 | 


8 




2713e| 


27302| 


27303) 




Exon
SEQ ID
NO: 






14522 


1 


19597I 


1 0675 


1 3347 




1 


1 


1 0087 


1 




16125 


18310 




11236 




1 4330 






§ 


I 


I 


17110 




























































Probe 
SEQ ID 
NO: 


1 


1 


| 4634; 






? 


| 3430| 


1 


1 


I 


8 


| 1951 1 


| 2527| 


1 


1 




| 1329[ 


1 44351 


| 4435| 


| 5765| 




1 


1 


1 


| 7233 | 


I 7233I 


1 
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